* [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL
@ 2023-02-17  0:58 Umesh Nerlige Ramappa
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 1/9] drm/i915/perf: Drop wakeref on GuC RC error Umesh Nerlige Ramappa
                   ` (11 more replies)
  0 siblings, 12 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC
  To: intel-gfx

The OAM unit captures OA reports specific to the media engines. Add support to
program the OAM unit on the media tile on MTL.

The OAM unit is selected by passing the class:instance of a media engine in the
perf parameters. The corresponding UMD changes are posted to the igt-dev repo as
part of supporting the GPUvis tool.

v2: Incorporate review feedback (Jani, Ashutosh)

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Test-with: 20230215004648.2100655-1-umesh.nerlige.ramappa@intel.com
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Lionel G Landwerlin <lionel.g.landwerlin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>

Chris Wilson (1):
  drm/i915/perf: Drop wakeref on GuC RC error

Umesh Nerlige Ramappa (8):
  drm/i915/perf: Add helper to check supported OA engines
  drm/i915/perf: Validate OA sseu config outside switch
  drm/i915/perf: Group engines into respective OA groups
  drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM
  drm/i915/perf: Parse 64bit report header formats correctly
  drm/i915/perf: Handle non-power-of-2 reports
  drm/i915/perf: Add engine class instance parameters to perf
  drm/i915/perf: Add support for OA media units

 drivers/gpu/drm/i915/gt/intel_engine_types.h |   4 +
 drivers/gpu/drm/i915/gt/intel_sseu.c         |   3 +-
 drivers/gpu/drm/i915/i915_driver.c           |   4 +-
 drivers/gpu/drm/i915/i915_drv.h              |   2 +
 drivers/gpu/drm/i915/i915_pci.c              |   1 +
 drivers/gpu/drm/i915/i915_perf.c             | 626 +++++++++++++++----
 drivers/gpu/drm/i915/i915_perf.h             |   2 +-
 drivers/gpu/drm/i915/i915_perf_oa_regs.h     |  78 +++
 drivers/gpu/drm/i915/i915_perf_types.h       | 103 ++-
 drivers/gpu/drm/i915/intel_device_info.h     |   1 +
 include/uapi/drm/i915_drm.h                  |  24 +
 11 files changed, 731 insertions(+), 117 deletions(-)

-- 
2.36.1



* [Intel-gfx] [PATCH v2 1/9] drm/i915/perf: Drop wakeref on GuC RC error
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 2/9] drm/i915/perf: Add helper to check supported OA engines Umesh Nerlige Ramappa
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC
  To: intel-gfx

From: Chris Wilson <chris.p.wilson@linux.intel.com>

If we fail to adjust the GuC run-control on opening the perf stream,
make sure we unwind the wakeref just taken.
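
For readers following along, the unwind order this patch aims for can be
sketched in plain userspace C. This is only an illustration: the struct, the
function and the fail_* switches are hypothetical stand-ins, not i915 code.
The point is that the err_gucrc label sits below the override undo, so a
failed override still drops the wakeref without touching GuC RC.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the stream state; not the i915 structure. */
struct stream_sketch {
	int wakerefs;		/* outstanding wakeref count */
	bool hw_overridden;	/* models the GuC RC override in "hardware" */
	bool override_gucrc;	/* set only after the override succeeded */
};

/* fail_gucrc and fail_buf are test switches that force an error path. */
static int stream_init_sketch(struct stream_sketch *s,
			      bool fail_gucrc, bool fail_buf)
{
	int ret;

	s->wakerefs++;			/* wakeref taken before the override */

	if (fail_gucrc) {
		ret = -EIO;
		goto err_gucrc;		/* nothing to undo except the wakeref */
	}
	s->hw_overridden = true;
	s->override_gucrc = true;

	if (fail_buf) {
		ret = -ENOMEM;
		goto err_oa_buf_alloc;
	}

	return 0;

err_oa_buf_alloc:
	/* Undo the override only if this stream actually applied it. */
	if (s->override_gucrc) {
		s->hw_overridden = false;
		s->override_gucrc = false;
	}

err_gucrc:
	s->wakerefs--;			/* always dropped on any error path */
	return ret;
}
```

Both error paths leave wakerefs at zero; only the later one has an override
to restore.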

v2: Retain old goto label names (Ashutosh)

Fixes: 01e742746785 ("drm/i915/guc: Support OA when Wa_16011777198 is enabled")
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c       | 14 +++++++++-----
 drivers/gpu/drm/i915/i915_perf_types.h |  6 ++++++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 824a34ec0b83..283a4a3c6862 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1592,9 +1592,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	/*
 	 * Wa_16011777198:dg2: Unset the override of GUCRC mode to enable rc6.
 	 */
-	if (intel_uc_uses_guc_rc(&gt->uc) &&
-	    (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) ||
-	     IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0)))
+	if (stream->override_gucrc)
 		drm_WARN_ON(&gt->i915->drm,
 			    intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc));
 
@@ -3305,8 +3303,10 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		if (ret) {
 			drm_dbg(&stream->perf->i915->drm,
 				"Unable to override gucrc mode\n");
-			goto err_config;
+			goto err_gucrc;
 		}
+
+		stream->override_gucrc = true;
 	}
 
 	ret = alloc_oa_buffer(stream);
@@ -3345,11 +3345,15 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	free_oa_buffer(stream);
 
 err_oa_buf_alloc:
-	free_oa_configs(stream);
+	if (stream->override_gucrc)
+		intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc);
 
+err_gucrc:
 	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
 	intel_engine_pm_put(stream->engine);
 
+	free_oa_configs(stream);
+
 err_config:
 	free_noa_wait(stream);
 
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
index ca150b7af3f2..e36f046fe2b6 100644
--- a/drivers/gpu/drm/i915/i915_perf_types.h
+++ b/drivers/gpu/drm/i915/i915_perf_types.h
@@ -316,6 +316,12 @@ struct i915_perf_stream {
 	 * buffer should be checked for available data.
 	 */
 	u64 poll_oa_period;
+
+	/**
+	 * @override_gucrc: GuC RC has been overridden for the perf stream,
+	 * and we need to restore the default configuration on release.
+	 */
+	bool override_gucrc:1;
 };
 
 /**
-- 
2.36.1



* [Intel-gfx] [PATCH v2 2/9] drm/i915/perf: Add helper to check supported OA engines
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 1/9] drm/i915/perf: Drop wakeref on GuC RC error Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 3/9] drm/i915/perf: Validate OA sseu config outside switch Umesh Nerlige Ramappa
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC
  To: intel-gfx

In preparation for adding more engines that are supported by OA, add a
helper to check whether an engine supports OA.

v2: (Ashutosh)
- Update commit message
- Drop virtual engine check since we support only one render engine

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 283a4a3c6862..b0e1acbe90fc 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1570,6 +1570,16 @@ free_noa_wait(struct i915_perf_stream *stream)
 	i915_vma_unpin_and_release(&stream->noa_wait, 0);
 }
 
+static bool engine_supports_oa(const struct intel_engine_cs *engine)
+{
+	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
+
+	switch (platform) {
+	default:
+		return engine->class == RENDER_CLASS;
+	}
+}
+
 static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 {
 	struct i915_perf *perf = stream->perf;
@@ -2505,7 +2515,7 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
 	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
 		GEM_BUG_ON(ce == ce->engine->kernel_context);
 
-		if (ce->engine->class != RENDER_CLASS)
+		if (!engine_supports_oa(ce->engine))
 			continue;
 
 		/* Otherwise OA settings will be set upon first use */
@@ -2656,7 +2666,7 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
 	for_each_uabi_engine(engine, i915) {
 		struct intel_context *ce = engine->kernel_context;
 
-		if (engine->class != RENDER_CLASS)
+		if (!engine_supports_oa(ce->engine))
 			continue;
 
 		regs[0].value = intel_sseu_make_rpcs(engine->gt, &ce->sseu);
@@ -3369,7 +3379,7 @@ void i915_oa_init_reg_state(const struct intel_context *ce,
 {
 	struct i915_perf_stream *stream;
 
-	if (engine->class != RENDER_CLASS)
+	if (!engine_supports_oa(engine))
 		return;
 
 	/* perf.exclusive_stream serialised by lrc_configure_all_contexts() */
-- 
2.36.1



* [Intel-gfx] [PATCH v2 3/9] drm/i915/perf: Validate OA sseu config outside switch
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 1/9] drm/i915/perf: Drop wakeref on GuC RC error Umesh Nerlige Ramappa
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 2/9] drm/i915/perf: Add helper to check supported OA engines Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-17  1:10   ` Dixit, Ashutosh
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 4/9] drm/i915/perf: Group engines into respective OA groups Umesh Nerlige Ramappa
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC
  To: intel-gfx

Once OA supports media engine class:instance, the engine can only be
validated after the property switch, since class and instance are passed
as separate parameters. Since the OA sseu config depends on the engine
class:instance, validate the OA sseu config outside the switch as well.
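
As a rough illustration of why the check has to move, here is a userspace
sketch of the same "record in the loop, validate after the loop" pattern.
The property ids, the capability table and all names below are made up for
the example and are not the i915 uAPI.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property ids; not the real DRM_I915_PERF_PROP_* values. */
enum { PROP_ENGINE = 1, PROP_SSEU = 2 };

/* Hypothetical per-engine slice capabilities, indexed by engine number. */
static const uint64_t engine_slice_caps[] = { 0xf, 0x3 };

struct props_sketch {
	uint64_t engine;
	uint64_t sseu;
	bool has_sseu;
};

/* The result depends on the final engine, so it cannot run mid-loop. */
static int check_sseu(uint64_t engine, uint64_t mask)
{
	if (engine >= sizeof(engine_slice_caps) / sizeof(engine_slice_caps[0]))
		return -EINVAL;

	return (mask & ~engine_slice_caps[engine]) ? -EINVAL : 0;
}

/* uprop holds n_props (id, value) pairs, in any order. */
static int read_props_sketch(const uint64_t *uprop, size_t n_props,
			     struct props_sketch *props)
{
	uint64_t user_sseu = 0;
	bool config_sseu = false;
	size_t i;
	int ret;

	props->engine = 0;		/* default engine */
	props->sseu = 0;
	props->has_sseu = false;

	for (i = 0; i < n_props; i++, uprop += 2) {
		switch (uprop[0]) {
		case PROP_ENGINE:
			props->engine = uprop[1];
			break;
		case PROP_SSEU:
			user_sseu = uprop[1];	/* only record it here */
			config_sseu = true;
			break;
		default:
			return -EINVAL;
		}
	}

	/* All properties are parsed, so the engine is now final. */
	if (config_sseu) {
		ret = check_sseu(props->engine, user_sseu);
		if (ret)
			return ret;
		props->sseu = user_sseu;
		props->has_sseu = true;
	}

	return 0;
}
```

With the check inside the switch, an sseu property that arrives before the
engine property would be validated against the default engine; with the
deferred check the result no longer depends on property order.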

v2: (Ashutosh)
- Clarify commit message
- Use drm_dbg instead of DRM_DEBUG
- Reorder stack variables

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index b0e1acbe90fc..1229f65534e2 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3950,7 +3950,9 @@ static int read_properties_unlocked(struct i915_perf *perf,
 				    u32 n_props,
 				    struct perf_open_properties *props)
 {
+	struct drm_i915_gem_context_param_sseu user_sseu;
 	u64 __user *uprop = uprops;
+	bool config_sseu = false;
 	u32 i;
 	int ret;
 
@@ -4079,8 +4081,6 @@ static int read_properties_unlocked(struct i915_perf *perf,
 			props->hold_preemption = !!value;
 			break;
 		case DRM_I915_PERF_PROP_GLOBAL_SSEU: {
-			struct drm_i915_gem_context_param_sseu user_sseu;
-
 			if (GRAPHICS_VER_FULL(perf->i915) >= IP_VER(12, 50)) {
 				drm_dbg(&perf->i915->drm,
 					"SSEU config not supported on gfx %x\n",
@@ -4095,14 +4095,7 @@ static int read_properties_unlocked(struct i915_perf *perf,
 					"Unable to copy global sseu parameter\n");
 				return -EFAULT;
 			}
-
-			ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
-			if (ret) {
-				drm_dbg(&perf->i915->drm,
-					"Invalid SSEU configuration\n");
-				return ret;
-			}
-			props->has_sseu = true;
+			config_sseu = true;
 			break;
 		}
 		case DRM_I915_PERF_PROP_POLL_OA_PERIOD:
@@ -4122,6 +4115,16 @@ static int read_properties_unlocked(struct i915_perf *perf,
 		uprop += 2;
 	}
 
+	if (config_sseu) {
+		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
+		if (ret) {
+			drm_dbg(&perf->i915->drm,
+				"Invalid SSEU configuration\n");
+			return ret;
+		}
+		props->has_sseu = true;
+	}
+
 	return 0;
 }
 
-- 
2.36.1



* [Intel-gfx] [PATCH v2 4/9] drm/i915/perf: Group engines into respective OA groups
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (2 preceding siblings ...)
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 3/9] drm/i915/perf: Validate OA sseu config outside switch Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-22 21:52   ` Dixit, Ashutosh
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 5/9] drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM Umesh Nerlige Ramappa
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC
  To: intel-gfx

Now that we may have multiple OA units in a single GT as well as on
separate GTs, create an engine group that maps to a single OA unit.
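
The grouping idea can be sketched in userspace C as below. This is a
simplified model under stated assumptions: the engine classes, the second
(media) group and every name here are hypothetical, and the patch itself
only defines a single group per GT at this point in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical group layout; the patch itself only has one group (OAG). */
#define GROUP_RENDER	0u
#define GROUP_MEDIA	1u
#define NUM_GROUPS	2u
#define GROUP_INVALID	UINT32_MAX

enum engine_class_sketch { CLASS_RENDER, CLASS_COPY, CLASS_VIDEO };

struct engine_sketch {
	unsigned int id;
	enum engine_class_sketch engine_class;
	int group;		/* group index, or -1 if OA is unsupported */
};

struct group_sketch {
	uint32_t unit_id;
	uint32_t num_engines;
	uint32_t engine_mask;
};

static uint32_t engine_group(const struct engine_sketch *e)
{
	switch (e->engine_class) {
	case CLASS_RENDER:
		return GROUP_RENDER;
	case CLASS_VIDEO:
		return GROUP_MEDIA;
	default:
		return GROUP_INVALID;
	}
}

/* g must be zero-initialised. Returns the next free OA unit id. */
static uint32_t init_groups(struct engine_sketch *engines, size_t n,
			    struct group_sketch *g, uint32_t next_unit_id)
{
	uint32_t index;
	size_t i;

	for (i = 0; i < n; i++) {
		index = engine_group(&engines[i]);
		if (index < NUM_GROUPS) {
			g[index].engine_mask |= 1u << engines[i].id;
			g[index].num_engines++;
			engines[i].group = (int)index;
		} else {
			engines[i].group = -1;
		}
	}

	/* Empty groups get no id, so the assigned ids stay contiguous. */
	for (i = 0; i < NUM_GROUPS; i++) {
		if (g[i].num_engines == 0)
			continue;
		g[i].unit_id = next_unit_id++;
	}

	return next_unit_id;
}
```

Passing the returned id into the call for the next GT keeps unit ids unique
across GTs, which is roughly what a device-wide counter would provide.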

v2: (Jani)
- Drop warning on ENOMEM
- Reorder patch in the series

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |   4 +
 drivers/gpu/drm/i915/gt/intel_sseu.c         |   3 +-
 drivers/gpu/drm/i915/i915_perf.c             | 124 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_perf_types.h       |  51 +++++++-
 4 files changed, 169 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 4fd54fb8810f..8a8b0dce241b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -53,6 +53,8 @@ struct intel_gt;
 struct intel_ring;
 struct intel_uncore;
 struct intel_breadcrumbs;
+struct intel_engine_cs;
+struct i915_perf_group;
 
 typedef u32 intel_engine_mask_t;
 #define ALL_ENGINES ((intel_engine_mask_t)~0ul)
@@ -603,6 +605,8 @@ struct intel_engine_cs {
 	} props, defaults;
 
 	I915_SELFTEST_DECLARE(struct fault_attr reset_timeout);
+
+	struct i915_perf_group *oa_group;
 };
 
 static inline bool
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 6c6198a257ac..1141f875f5bd 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -6,6 +6,7 @@
 #include <linux/string_helpers.h>
 
 #include "i915_drv.h"
+#include "i915_perf_types.h"
 #include "intel_engine_regs.h"
 #include "intel_gt_regs.h"
 #include "intel_sseu.h"
@@ -677,7 +678,7 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt,
 	 * If i915/perf is active, we want a stable powergating configuration
 	 * on the system. Use the configuration pinned by i915/perf.
 	 */
-	if (gt->perf.exclusive_stream)
+	if (gt->perf.group && gt->perf.group[PERF_GROUP_OAG].exclusive_stream)
 		req_sseu = &gt->perf.sseu;
 
 	slices = hweight8(req_sseu->slice_mask);
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 1229f65534e2..37c4cc44d68c 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1584,8 +1584,9 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 {
 	struct i915_perf *perf = stream->perf;
 	struct intel_gt *gt = stream->engine->gt;
+	struct i915_perf_group *g = stream->engine->oa_group;
 
-	if (WARN_ON(stream != gt->perf.exclusive_stream))
+	if (WARN_ON(stream != g->exclusive_stream))
 		return;
 
 	/*
@@ -1594,7 +1595,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	 *
 	 * See i915_oa_init_reg_state() and lrc_configure_all_contexts()
 	 */
-	WRITE_ONCE(gt->perf.exclusive_stream, NULL);
+	WRITE_ONCE(g->exclusive_stream, NULL);
 	perf->ops.disable_metric_set(stream);
 
 	free_oa_buffer(stream);
@@ -3192,6 +3193,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 {
 	struct drm_i915_private *i915 = stream->perf->i915;
 	struct i915_perf *perf = stream->perf;
+	struct i915_perf_group *g;
 	struct intel_gt *gt;
 	int ret;
 
@@ -3202,6 +3204,12 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	}
 	gt = props->engine->gt;
 
+	g = props->engine->oa_group;
+	if (!g) {
+		drm_dbg(&i915->drm, "Perf group invalid\n");
+		return -EINVAL;
+	}
+
 	/*
 	 * If the sysfs metrics/ directory wasn't registered for some
 	 * reason then don't let userspace try their luck with config
@@ -3231,7 +3239,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	 * counter reports and marshal to the appropriate client
 	 * we currently only allow exclusive access
 	 */
-	if (gt->perf.exclusive_stream) {
+	if (g->exclusive_stream) {
 		drm_dbg(&stream->perf->i915->drm,
 			"OA unit already in use\n");
 		return -EBUSY;
@@ -3326,7 +3334,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	stream->ops = &i915_oa_stream_ops;
 
 	stream->engine->gt->perf.sseu = props->sseu;
-	WRITE_ONCE(gt->perf.exclusive_stream, stream);
+	WRITE_ONCE(g->exclusive_stream, stream);
 
 	ret = i915_perf_stream_enable_sync(stream);
 	if (ret) {
@@ -3349,7 +3357,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	return 0;
 
 err_enable:
-	WRITE_ONCE(gt->perf.exclusive_stream, NULL);
+	WRITE_ONCE(g->exclusive_stream, NULL);
 	perf->ops.disable_metric_set(stream);
 
 	free_oa_buffer(stream);
@@ -3378,12 +3386,13 @@ void i915_oa_init_reg_state(const struct intel_context *ce,
 			    const struct intel_engine_cs *engine)
 {
 	struct i915_perf_stream *stream;
+	struct i915_perf_group *g = engine->oa_group;
 
-	if (!engine_supports_oa(engine))
+	if (!g)
 		return;
 
 	/* perf.exclusive_stream serialised by lrc_configure_all_contexts() */
-	stream = READ_ONCE(engine->gt->perf.exclusive_stream);
+	stream = READ_ONCE(g->exclusive_stream);
 	if (stream && GRAPHICS_VER(stream->perf->i915) < 12)
 		gen8_update_reg_state_unlocked(ce, stream);
 }
@@ -4753,6 +4762,95 @@ static struct ctl_table oa_table[] = {
 	{}
 };
 
+static u32 __num_perf_groups_per_gt(struct intel_gt *gt)
+{
+	enum intel_platform platform = INTEL_INFO(gt->i915)->platform;
+
+	switch (platform) {
+	default:
+		return 1;
+	}
+}
+
+static u32 __oa_engine_group(struct intel_engine_cs *engine)
+{
+	if (!engine_supports_oa(engine))
+		return PERF_GROUP_INVALID;
+
+	switch (engine->class) {
+	case RENDER_CLASS:
+		return PERF_GROUP_OAG;
+
+	default:
+		return PERF_GROUP_INVALID;
+	}
+}
+
+static void oa_init_groups(struct intel_gt *gt)
+{
+	int i, num_groups = gt->perf.num_perf_groups;
+	struct i915_perf *perf = &gt->i915->perf;
+
+	for (i = 0; i < num_groups; i++) {
+		struct i915_perf_group *g = &gt->perf.group[i];
+
+		/* Fused off engines can result in a group with num_engines == 0 */
+		if (g->num_engines == 0)
+			continue;
+
+		/* Set oa_unit_ids now to ensure ids remain contiguous. */
+		g->oa_unit_id = perf->oa_unit_ids++;
+
+		g->gt = gt;
+	}
+}
+
+static int oa_init_gt(struct intel_gt *gt)
+{
+	u32 num_groups = __num_perf_groups_per_gt(gt);
+	struct intel_engine_cs *engine;
+	struct i915_perf_group *g;
+	intel_engine_mask_t tmp;
+
+	g = kcalloc(num_groups, sizeof(*g), GFP_KERNEL);
+	if (!g)
+		return -ENOMEM;
+
+	for_each_engine_masked(engine, gt, ALL_ENGINES, tmp) {
+		u32 index;
+
+		index = __oa_engine_group(engine);
+		if (index < num_groups) {
+			g[index].engine_mask |= BIT(engine->id);
+			g[index].num_engines++;
+			engine->oa_group = &g[index];
+		} else {
+			engine->oa_group = NULL;
+		}
+	}
+
+	gt->perf.num_perf_groups = num_groups;
+	gt->perf.group = g;
+
+	oa_init_groups(gt);
+
+	return 0;
+}
+
+static int oa_init_engine_groups(struct i915_perf *perf)
+{
+	struct intel_gt *gt;
+	int i, ret;
+
+	for_each_gt(gt, perf->i915, i) {
+		ret = oa_init_gt(gt);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 static void oa_init_supported_formats(struct i915_perf *perf)
 {
 	struct drm_i915_private *i915 = perf->i915;
@@ -4919,7 +5017,7 @@ void i915_perf_init(struct drm_i915_private *i915)
 
 	if (perf->ops.enable_metric_set) {
 		struct intel_gt *gt;
-		int i;
+		int i, ret;
 
 		for_each_gt(gt, i915, i)
 			mutex_init(&gt->perf.lock);
@@ -4958,6 +5056,11 @@ void i915_perf_init(struct drm_i915_private *i915)
 
 		perf->i915 = i915;
 
+		ret = oa_init_engine_groups(perf);
+		if (ret)
+			drm_err(&i915->drm,
+				"OA initialization failed %d\n", ret);
+
 		oa_init_supported_formats(perf);
 	}
 }
@@ -4986,10 +5089,15 @@ void i915_perf_sysctl_unregister(void)
 void i915_perf_fini(struct drm_i915_private *i915)
 {
 	struct i915_perf *perf = &i915->perf;
+	struct intel_gt *gt;
+	int i;
 
 	if (!perf->i915)
 		return;
 
+	for_each_gt(gt, perf->i915, i)
+		kfree(gt->perf.group);
+
 	idr_for_each(&perf->metrics_idr, destroy_config, perf);
 	idr_destroy(&perf->metrics_idr);
 
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
index e36f046fe2b6..ce99551ad0fd 100644
--- a/drivers/gpu/drm/i915/i915_perf_types.h
+++ b/drivers/gpu/drm/i915/i915_perf_types.h
@@ -17,6 +17,7 @@
 #include <linux/wait.h>
 #include <uapi/drm/i915_drm.h>
 
+#include "gt/intel_engine_types.h"
 #include "gt/intel_sseu.h"
 #include "i915_reg_defs.h"
 #include "intel_wakeref.h"
@@ -30,6 +31,13 @@ struct i915_vma;
 struct intel_context;
 struct intel_engine_cs;
 
+enum {
+	PERF_GROUP_OAG = 0,
+
+	PERF_GROUP_MAX,
+	PERF_GROUP_INVALID = U32_MAX,
+};
+
 struct i915_oa_format {
 	u32 format;
 	int size;
@@ -390,6 +398,35 @@ struct i915_oa_ops {
 	u32 (*oa_hw_tail_read)(struct i915_perf_stream *stream);
 };
 
+struct i915_perf_group {
+	/*
+	 * @oa_unit_id: Identifier for the OA unit.
+	 */
+	u32 oa_unit_id;
+
+	/*
+	 * @gt: gt that this group belongs to
+	 */
+	struct intel_gt *gt;
+
+	/*
+	 * @exclusive_stream: The stream currently using the OA unit. This is
+	 * sometimes accessed outside a syscall associated to its file
+	 * descriptor.
+	 */
+	struct i915_perf_stream *exclusive_stream;
+
+	/*
+	 * @num_engines: The number of engines using this OA buffer.
+	 */
+	u32 num_engines;
+
+	/*
+	 * @engine_mask: A mask of engines using a single OA buffer.
+	 */
+	intel_engine_mask_t engine_mask;
+};
+
 struct i915_perf_gt {
 	/*
 	 * Lock associated with anything below within this structure.
@@ -402,12 +439,15 @@ struct i915_perf_gt {
 	 */
 	struct intel_sseu sseu;
 
+	/**
+	 * @num_perf_groups: number of perf groups per gt.
+	 */
+	u32 num_perf_groups;
+
 	/*
-	 * @exclusive_stream: The stream currently using the OA unit. This is
-	 * sometimes accessed outside a syscall associated to its file
-	 * descriptor.
+	 * @group: list of OA groups - one for each OA buffer.
 	 */
-	struct i915_perf_stream *exclusive_stream;
+	struct i915_perf_group *group;
 };
 
 struct i915_perf {
@@ -461,6 +501,9 @@ struct i915_perf {
 	unsigned long format_mask[FORMAT_MASK_SIZE];
 
 	atomic64_t noa_programming_delay;
+
+	/* oa unit ids */
+	u32 oa_unit_ids;
 };
 
 #endif /* _I915_PERF_TYPES_H_ */
-- 
2.36.1



* [Intel-gfx] [PATCH v2 5/9] drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (3 preceding siblings ...)
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 4/9] drm/i915/perf: Group engines into respective OA groups Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-17  2:04   ` Dixit, Ashutosh
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 6/9] drm/i915/perf: Parse 64bit report header formats correctly Umesh Nerlige Ramappa
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC
  To: intel-gfx

i915_perf_init can fail due to OOM. Fail driver init if i915_perf_init
fails.

v2: (Jani)
- Reorder patch in the series

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_driver.c | 4 +++-
 drivers/gpu/drm/i915/i915_perf.c   | 8 ++++++--
 drivers/gpu/drm/i915/i915_perf.h   | 2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index 0c0ae3eabb4b..998ca41c9713 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -477,7 +477,9 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
 	if (ret)
 		return ret;
 
-	i915_perf_init(dev_priv);
+	ret = i915_perf_init(dev_priv);
+	if (ret)
+		return ret;
 
 	ret = i915_ggtt_probe_hw(dev_priv);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 37c4cc44d68c..3306653c0b85 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -4941,7 +4941,7 @@ static void i915_perf_init_info(struct drm_i915_private *i915)
  * Note: i915-perf initialization is split into an 'init' and 'register'
  * phase with the i915_perf_register() exposing state to userspace.
  */
-void i915_perf_init(struct drm_i915_private *i915)
+int i915_perf_init(struct drm_i915_private *i915)
 {
 	struct i915_perf *perf = &i915->perf;
 
@@ -5057,12 +5057,16 @@ void i915_perf_init(struct drm_i915_private *i915)
 		perf->i915 = i915;
 
 		ret = oa_init_engine_groups(perf);
-		if (ret)
+		if (ret) {
 			drm_err(&i915->drm,
 				"OA initialization failed %d\n", ret);
+			return ret;
+		}
 
 		oa_init_supported_formats(perf);
 	}
+
+	return 0;
 }
 
 static int destroy_config(int id, void *p, void *data)
diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h
index f96e09a4af04..253637651d5e 100644
--- a/drivers/gpu/drm/i915/i915_perf.h
+++ b/drivers/gpu/drm/i915/i915_perf.h
@@ -18,7 +18,7 @@ struct i915_oa_config;
 struct intel_context;
 struct intel_engine_cs;
 
-void i915_perf_init(struct drm_i915_private *i915);
+int i915_perf_init(struct drm_i915_private *i915);
 void i915_perf_fini(struct drm_i915_private *i915);
 void i915_perf_register(struct drm_i915_private *i915);
 void i915_perf_unregister(struct drm_i915_private *i915);
-- 
2.36.1



* [Intel-gfx] [PATCH v2 6/9] drm/i915/perf: Parse 64bit report header formats correctly
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (4 preceding siblings ...)
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 5/9] drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-21 22:14   ` Dixit, Ashutosh
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports Umesh Nerlige Ramappa
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC
  To: intel-gfx

Now that some OA formats use 64-bit reports, the report header has 64-bit
report-id, timestamp, context-id and gpu-ticks fields. When filtering
these reports, use the right width for these fields.

Note that the upper dword of the context id is reserved, so squash only
the lower dword.
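
To make the two layouts concrete, here is a userspace sketch of
width-dependent header access. The byte offsets are inferred from the
indexing used in this patch (dword 0/1/2 for 32-bit headers, dword 0/2/4
for 64-bit headers); the helper names are made up, and reading the "lower
dword" at the lower offset assumes a little-endian host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Read header field number 'index' using the header's native width. */
static uint64_t load_field(const void *report, bool hdr64, unsigned int index)
{
	const uint8_t *p = report;
	uint64_t v64;
	uint32_t v32;

	if (hdr64) {
		memcpy(&v64, p + 8 * index, sizeof(v64));
		return v64;
	}

	memcpy(&v32, p + 4 * index, sizeof(v32));
	return v32;
}

static uint64_t report_id_sketch(const void *report, bool hdr64)
{
	return load_field(report, hdr64, 0);
}

static uint64_t timestamp_sketch(const void *report, bool hdr64)
{
	return load_field(report, hdr64, 1);
}

/* Only the lower dword is used; the upper dword is reserved. */
static uint32_t context_id_sketch(const void *report, bool hdr64)
{
	const uint8_t *p = report;
	uint32_t v;

	memcpy(&v, p + (hdr64 ? 16 : 8), sizeof(v));
	return v;
}

/* A report with zero id and zero timestamp is treated as not yet landed. */
static bool report_unlanded_sketch(const void *report, bool hdr64)
{
	return !report_id_sketch(report, hdr64) &&
	       !timestamp_sketch(report, hdr64);
}
```

memcpy() is used instead of pointer casts only to keep the sketch free of
alignment and aliasing concerns in userspace.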

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c       | 105 ++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_perf_types.h |   6 ++
 2 files changed, 92 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 3306653c0b85..9715b964aa1e 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -441,6 +441,75 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream)
 	return oastatus1 & GEN7_OASTATUS1_TAIL_MASK;
 }
 
+#define oa_report_header_64bit(__s) \
+	((__s)->oa_buffer.format->header == HDR_64_BIT)
+
+static inline u64
+oa_report_id(struct i915_perf_stream *stream, void *report)
+{
+	return oa_report_header_64bit(stream) ? *(u64 *)report : *(u32 *)report;
+}
+
+static inline u64
+oa_report_reason(struct i915_perf_stream *stream, void *report)
+{
+	return (oa_report_id(stream, report) >> OAREPORT_REASON_SHIFT) &
+	       (GRAPHICS_VER(stream->perf->i915) == 12 ?
+		OAREPORT_REASON_MASK_EXTENDED :
+		OAREPORT_REASON_MASK);
+}
+
+static inline void
+oa_report_id_clear(struct i915_perf_stream *stream, u32 *report)
+{
+	if (oa_report_header_64bit(stream))
+		*(u64 *)report = 0;
+	else
+		*report = 0;
+}
+
+static inline bool
+oa_report_ctx_invalid(struct i915_perf_stream *stream, void *report)
+{
+	return !(oa_report_id(stream, report) &
+	       stream->perf->gen8_valid_ctx_bit) &&
+	       GRAPHICS_VER(stream->perf->i915) <= 11;
+}
+
+static inline u64
+oa_timestamp(struct i915_perf_stream *stream, void *report)
+{
+	return oa_report_header_64bit(stream) ?
+		*((u64 *)report + 1) :
+		*((u32 *)report + 1);
+}
+
+static inline void
+oa_timestamp_clear(struct i915_perf_stream *stream, u32 *report)
+{
+	if (oa_report_header_64bit(stream))
+		*(u64 *)&report[2] = 0;
+	else
+		report[1] = 0;
+}
+
+static inline u32
+oa_context_id(struct i915_perf_stream *stream, u32 *report)
+{
+	u32 ctx_id = oa_report_header_64bit(stream) ? report[4] : report[2];
+
+	return ctx_id & stream->specific_ctx_id_mask;
+}
+
+static inline void
+oa_context_id_squash(struct i915_perf_stream *stream, u32 *report)
+{
+	if (oa_report_header_64bit(stream))
+		report[4] = INVALID_CTX_ID;
+	else
+		report[2] = INVALID_CTX_ID;
+}
+
 /**
  * oa_buffer_check_unlocked - check for data and update tail ptr state
  * @stream: i915 stream instance
@@ -521,9 +590,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 		 * If not : (╯°□°)╯︵ ┻━┻
 		 */
 		while (OA_TAKEN(tail, aged_tail) >= report_size) {
-			u32 *report32 = (void *)(stream->oa_buffer.vaddr + tail);
+			void *report = stream->oa_buffer.vaddr + tail;
 
-			if (report32[0] != 0 || report32[1] != 0)
+			if (oa_report_id(stream, report) ||
+			    oa_timestamp(stream, report))
 				break;
 
 			tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
@@ -702,7 +772,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 		u8 *report = oa_buf_base + head;
 		u32 *report32 = (void *)report;
 		u32 ctx_id;
-		u32 reason;
+		u64 reason;
 
 		/*
 		 * All the report sizes factor neatly into the buffer
@@ -725,16 +795,12 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 		 * triggered this specific report (mostly timer
 		 * triggered or e.g. due to a context switch).
 		 *
-		 * This field is never expected to be zero so we can
-		 * check that the report isn't invalid before copying
-		 * it to userspace...
+		 * In MMIO triggered reports, some platforms do not set the
+		 * reason bit in this field and it is valid to have a reason
+		 * field of zero.
 		 */
-		reason = ((report32[0] >> OAREPORT_REASON_SHIFT) &
-			  (GRAPHICS_VER(stream->perf->i915) == 12 ?
-			   OAREPORT_REASON_MASK_EXTENDED :
-			   OAREPORT_REASON_MASK));
-
-		ctx_id = report32[2] & stream->specific_ctx_id_mask;
+		reason = oa_report_reason(stream, report);
+		ctx_id = oa_context_id(stream, report32);
 
 		/*
 		 * Squash whatever is in the CTX_ID field if it's marked as
@@ -744,9 +810,10 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 		 * Note: that we don't clear the valid_ctx_bit so userspace can
 		 * understand that the ID has been squashed by the kernel.
 		 */
-		if (!(report32[0] & stream->perf->gen8_valid_ctx_bit) &&
-		    GRAPHICS_VER(stream->perf->i915) <= 11)
-			ctx_id = report32[2] = INVALID_CTX_ID;
+		if (oa_report_ctx_invalid(stream, report)) {
+			ctx_id = INVALID_CTX_ID;
+			oa_context_id_squash(stream, report32);
+		}
 
 		/*
 		 * NB: For Gen 8 the OA unit no longer supports clock gating
@@ -790,7 +857,7 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 			 */
 			if (stream->ctx &&
 			    stream->specific_ctx_id != ctx_id) {
-				report32[2] = INVALID_CTX_ID;
+				oa_context_id_squash(stream, report32);
 			}
 
 			ret = append_oa_sample(stream, buf, count, offset,
@@ -802,11 +869,11 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 		}
 
 		/*
-		 * Clear out the first 2 dword as a mean to detect unlanded
+		 * Clear out the report id and timestamp as a means to detect unlanded
 		 * reports.
 		 */
-		report32[0] = 0;
-		report32[1] = 0;
+		oa_report_id_clear(stream, report32);
+		oa_timestamp_clear(stream, report32);
 	}
 
 	if (start_offset != *offset) {
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
index ce99551ad0fd..8ccb0b89d019 100644
--- a/drivers/gpu/drm/i915/i915_perf_types.h
+++ b/drivers/gpu/drm/i915/i915_perf_types.h
@@ -38,9 +38,15 @@ enum {
 	PERF_GROUP_INVALID = U32_MAX,
 };
 
+enum report_header {
+	HDR_32_BIT = 0,
+	HDR_64_BIT,
+};
+
 struct i915_oa_format {
 	u32 format;
 	int size;
+	enum report_header header;
 };
 
 struct i915_oa_reg {
-- 
2.36.1


* [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (5 preceding siblings ...)
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 6/9] drm/i915/perf: Parse 64bit report header formats correctly Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-17 20:58   ` Dixit, Ashutosh
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf Umesh Nerlige Ramappa
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC (permalink / raw)
  To: intel-gfx

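Some of the newer OA formats have report sizes that are not powers of 2.
For those formats, adjust the hw_tail accordingly when checking for new
reports. Such reports can also straddle the end of the OA buffer, so copy
them to userspace in two parts when they wrap.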
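
As a rough illustration of the tail adjustment (not the driver code itself),
the sketch below works on byte offsets within the buffer rather than GGTT
addresses. The 16 MiB buffer size, the oa_align_tail() helper and its
parameter names are assumptions made for this example; OA_TAKEN is written
the way the driver is believed to define it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: a 16 MiB circular OA buffer (a power of two). */
#define OA_BUFFER_SIZE (16u * 1024u * 1024u)

/* Bytes from head up to tail in the circular buffer, modulo its size. */
#define OA_TAKEN(tail, head) (((tail) - (head)) & (OA_BUFFER_SIZE - 1))

/*
 * Hypothetical helper: pull hw_tail back to the last whole-report boundary.
 * Reports are counted from aligned_tail, a previously aligned tail offset,
 * so the result stays correct when report_size is not a power of 2.
 */
static uint32_t oa_align_tail(uint32_t hw_tail, uint32_t aligned_tail,
			      uint32_t report_size)
{
	uint32_t partial;

	assert(report_size > 0 && report_size <= OA_BUFFER_SIZE);

	/* Size of the trailing report that has only partially landed. */
	partial = OA_TAKEN(hw_tail, aligned_tail) % report_size;

	/* Subtract it and wrap back into the buffer. */
	return (hw_tail - partial) & (OA_BUFFER_SIZE - 1);
}
```

With 192-byte reports and an aligned tail of 0, a hardware tail at offset 448
is pulled back to 384, the end of the second whole report. For a power-of-2
size such as 256 the result matches the old mask-based rounding.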

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 50 ++++++++++++++++++--------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 9715b964aa1e..d3a1892c93be 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -542,6 +542,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 	bool pollin;
 	u32 hw_tail;
 	u64 now;
+	u32 partial_report_size;
 
 	/* We have to consider the (unlikely) possibility that read() errors
 	 * could result in an OA buffer reset which might reset the head and
@@ -551,10 +552,16 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
 
 	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
 
-	/* The tail pointer increases in 64 byte increments,
-	 * not in report_size steps...
+	/* The tail pointer increases in 64 byte increments, whereas report
+	 * sizes need not be integral multiples of 64 or powers of 2.
+	 * Compute the size of any partially landed report in the OA buffer.
 	 */
-	hw_tail &= ~(report_size - 1);
+	partial_report_size = OA_TAKEN(hw_tail, stream->oa_buffer.tail);
+	partial_report_size %= report_size;
+
+	/* Subtract partial amount off the tail */
+	hw_tail = gtt_offset + ((hw_tail - partial_report_size) &
+				(stream->oa_buffer.vma->size - 1));
 
 	now = ktime_get_mono_fast_ns();
 
@@ -677,6 +684,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
 {
 	int report_size = stream->oa_buffer.format->size;
 	struct drm_i915_perf_record_header header;
+	int report_size_partial;
+	u8 *oa_buf_end;
 
 	header.type = DRM_I915_PERF_RECORD_SAMPLE;
 	header.pad = 0;
@@ -690,8 +699,21 @@ static int append_oa_sample(struct i915_perf_stream *stream,
 		return -EFAULT;
 	buf += sizeof(header);
 
-	if (copy_to_user(buf, report, report_size))
+	oa_buf_end = stream->oa_buffer.vaddr +
+		     stream->oa_buffer.vma->size;
+	report_size_partial = oa_buf_end - report;
+
+	if (report_size_partial < report_size) {
+		if (copy_to_user(buf, report, report_size_partial))
+			return -EFAULT;
+		buf += report_size_partial;
+
+		if (copy_to_user(buf, stream->oa_buffer.vaddr,
+				 report_size - report_size_partial))
+			return -EFAULT;
+	} else if (copy_to_user(buf, report, report_size)) {
 		return -EFAULT;
+	}
 
 	(*offset) += header.size;
 
@@ -759,8 +781,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 	 * all a power of two).
 	 */
 	if (drm_WARN_ONCE(&uncore->i915->drm,
-			  head > OA_BUFFER_SIZE || head % report_size ||
-			  tail > OA_BUFFER_SIZE || tail % report_size,
+			  head > OA_BUFFER_SIZE ||
+			  tail > OA_BUFFER_SIZE,
 			  "Inconsistent OA buffer pointers: head = %u, tail = %u\n",
 			  head, tail))
 		return -EIO;
@@ -774,22 +796,6 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 		u32 ctx_id;
 		u64 reason;
 
-		/*
-		 * All the report sizes factor neatly into the buffer
-		 * size so we never expect to see a report split
-		 * between the beginning and end of the buffer.
-		 *
-		 * Given the initial alignment check a misalignment
-		 * here would imply a driver bug that would result
-		 * in an overrun.
-		 */
-		if (drm_WARN_ON(&uncore->i915->drm,
-				(OA_BUFFER_SIZE - head) < report_size)) {
-			drm_err(&uncore->i915->drm,
-				"Spurious OA head ptr: non-integral report offset\n");
-			break;
-		}
-
 		/*
 		 * The reason field includes flags identifying what
 		 * triggered this specific report (mostly timer
-- 
2.36.1


* [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (6 preceding siblings ...)
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-17 23:37   ` Umesh Nerlige Ramappa
  2023-02-21 23:53   ` Dixit, Ashutosh
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units Umesh Nerlige Ramappa
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC (permalink / raw)
  To: intel-gfx

The current implementation of perf defaults to the render engine and
configures the default OAG unit. Since there are more OA units on newer
hardware, allow the user to pass an engine class and instance to program a
specific OA unit.
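
A minimal sketch of how the defaults and the two new properties interact,
assuming a simplified parser. The PROP_* and ENGINE_CLASS_* values,
struct engine_sel and parse_engine_props() are hypothetical stand-ins for
this example; the real property IDs live in include/uapi/drm/i915_drm.h and
the real parsing is done by read_properties_unlocked().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs for illustration only; not the uAPI values. */
enum { PROP_OA_ENGINE_CLASS = 1, PROP_OA_ENGINE_INSTANCE = 2, PROP_MAX };
enum { ENGINE_CLASS_RENDER = 0 };

struct engine_sel {
	uint8_t engine_class;
	uint8_t instance;
};

/*
 * props holds n_props (id, value) pairs of u64, as in the perf open ioctl.
 * Returns 0 on success, -1 on an invalid count or unknown property ID.
 */
static int parse_engine_props(const uint64_t *props, size_t n_props,
			      struct engine_sel *out)
{
	size_t i;

	assert(props && out);

	if (!n_props || n_props >= PROP_MAX)
		return -1;

	/* Defaults when class:instance is not passed. */
	out->engine_class = ENGINE_CLASS_RENDER;
	out->instance = 0;

	for (i = 0; i < n_props; i++) {
		uint64_t id = props[2 * i];
		uint64_t value = props[2 * i + 1];

		switch (id) {
		case PROP_OA_ENGINE_CLASS:
			out->engine_class = (uint8_t)value;
			break;
		case PROP_OA_ENGINE_INSTANCE:
			out->instance = (uint8_t)value;
			break;
		default:
			return -1;
		}
	}

	return 0;
}
```

The engine lookup happens only after the loop, so the two properties may
appear in any order and either one may be omitted.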

UMD specific changes for GPUvis support:
https://patchwork.freedesktop.org/patch/522827/?series=114023
https://patchwork.freedesktop.org/patch/522822/?series=114023
https://patchwork.freedesktop.org/patch/522826/?series=114023
https://patchwork.freedesktop.org/patch/522828/?series=114023
https://patchwork.freedesktop.org/patch/522816/?series=114023
https://patchwork.freedesktop.org/patch/522825/?series=114023

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 49 +++++++++++++++++++-------------
 include/uapi/drm/i915_drm.h      | 20 +++++++++++++
 2 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index d3a1892c93be..f028df812067 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -4035,40 +4035,29 @@ static int read_properties_unlocked(struct i915_perf *perf,
 	struct drm_i915_gem_context_param_sseu user_sseu;
 	u64 __user *uprop = uprops;
 	bool config_sseu = false;
+	u8 class, instance;
 	u32 i;
 	int ret;
 
 	memset(props, 0, sizeof(struct perf_open_properties));
 	props->poll_oa_period = DEFAULT_POLL_PERIOD_NS;
 
-	if (!n_props) {
-		drm_dbg(&perf->i915->drm,
-			"No i915 perf properties given\n");
-		return -EINVAL;
-	}
-
-	/* At the moment we only support using i915-perf on the RCS. */
-	props->engine = intel_engine_lookup_user(perf->i915,
-						 I915_ENGINE_CLASS_RENDER,
-						 0);
-	if (!props->engine) {
-		drm_dbg(&perf->i915->drm,
-			"No RENDER-capable engines\n");
-		return -EINVAL;
-	}
-
 	/* Considering that ID = 0 is reserved and assuming that we don't
 	 * (currently) expect any configurations to ever specify duplicate
 	 * values for a particular property ID then the last _PROP_MAX value is
 	 * one greater than the maximum number of properties we expect to get
 	 * from userspace.
 	 */
-	if (n_props >= DRM_I915_PERF_PROP_MAX) {
+	if (!n_props || n_props >= DRM_I915_PERF_PROP_MAX) {
 		drm_dbg(&perf->i915->drm,
-			"More i915 perf properties specified than exist\n");
+			"Invalid no. of i915 perf properties given\n");
 		return -EINVAL;
 	}
 
+	/* Defaults when class:instance is not passed */
+	class = I915_ENGINE_CLASS_RENDER;
+	instance = 0;
+
 	for (i = 0; i < n_props; i++) {
 		u64 oa_period, oa_freq_hz;
 		u64 id, value;
@@ -4189,7 +4178,13 @@ static int read_properties_unlocked(struct i915_perf *perf,
 			}
 			props->poll_oa_period = value;
 			break;
-		case DRM_I915_PERF_PROP_MAX:
+		case DRM_I915_PERF_PROP_OA_ENGINE_CLASS:
+			class = (u8)value;
+			break;
+		case DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE:
+			instance = (u8)value;
+			break;
+		default:
 			MISSING_CASE(id);
 			return -EINVAL;
 		}
@@ -4197,6 +4192,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
 		uprop += 2;
 	}
 
+	props->engine = intel_engine_lookup_user(perf->i915, class, instance);
+	if (!props->engine) {
+		drm_dbg(&perf->i915->drm,
+			"OA engine class and instance invalid %d:%d\n",
+			class, instance);
+		return -EINVAL;
+	}
+
+	if (!engine_supports_oa(props->engine))
+		return -EINVAL;
+
 	if (config_sseu) {
 		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
 		if (ret) {
@@ -5208,8 +5214,11 @@ int i915_perf_ioctl_version(void)
 	 *
 	 * 5: Add DRM_I915_PERF_PROP_POLL_OA_PERIOD parameter that controls the
 	 *    interval for the hrtimer used to check for OA data.
+	 *
+	 * 6: Add DRM_I915_PERF_PROP_OA_ENGINE_CLASS and
+	 *    DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE
 	 */
-	return 5;
+	return 6;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 8df261c5ab9b..b6922b52d85c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2758,6 +2758,26 @@ enum drm_i915_perf_property_id {
 	 */
 	DRM_I915_PERF_PROP_POLL_OA_PERIOD,
 
+	/**
+	 * In platforms with multiple OA buffers, the engine class instance must
+	 * be passed to open a stream to an OA unit corresponding to the engine.
+	 * Multiple engines may be mapped to the same OA unit.
+	 *
+	 * In addition to the class:instance, if a gem context is also passed, then
+	 * 1) the report headers of OA reports from other engines are squashed.
+	 * 2) OAR is enabled for the class:instance
+	 *
+	 * This property is available in perf revision 6.
+	 */
+	DRM_I915_PERF_PROP_OA_ENGINE_CLASS,
+
+	/**
+	 * This parameter specifies the engine instance.
+	 *
+	 * This property is available in perf revision 6.
+	 */
+	DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE,
+
 	DRM_I915_PERF_PROP_MAX /* non-ABI */
 };
 
-- 
2.36.1


* [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (7 preceding siblings ...)
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf Umesh Nerlige Ramappa
@ 2023-02-17  0:58 ` Umesh Nerlige Ramappa
  2023-02-17 23:37   ` Umesh Nerlige Ramappa
  2023-02-23 20:05   ` Dixit, Ashutosh
  2023-02-17  1:35 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Add OAM support for MTL (rev2) Patchwork
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17  0:58 UTC (permalink / raw)
  To: intel-gfx

MTL introduces additional OA units dedicated to media use cases. Add
support for programming these OA units by passing the media engine class
and instance parameters.
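
To make the addressing scheme concrete, here is a small sketch of how the
OAM register addresses follow from a per-unit base plus fixed offsets. The
base and offsets are the values used in this patch; struct oam_regs, its
base field and oam_regs_for_base() are simplified stand-ins invented for
the example (plain u32 addresses instead of i915_reg_t), and designated
initializers are used here purely for readability.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from this patch. */
#define MTL_OAM_SAMEDIA_0_BASE		0x393000u
#define OAM_CONTROL_OFFSET		0x194u
#define OAM_DEBUG_OFFSET		0x198u
#define OAM_STATUS_OFFSET		0x19cu
#define OAM_HEAD_POINTER_OFFSET		0x1a0u
#define OAM_TAIL_POINTER_OFFSET		0x1a4u
#define OAM_BUFFER_OFFSET		0x1a8u
#define OAM_CONTEXT_CONTROL_OFFSET	0x1bcu
#define OAM_CONTROL_COUNTER_FORMAT_SHIFT 1u

/* Hypothetical, simplified register table for one OA unit. */
struct oam_regs {
	uint32_t base;
	uint32_t oa_head_ptr;
	uint32_t oa_tail_ptr;
	uint32_t oa_buffer;
	uint32_t oa_ctx_ctrl;
	uint32_t oa_ctrl;
	uint32_t oa_debug;
	uint32_t oa_status;
	uint32_t oa_ctrl_counter_format_shift;
};

static struct oam_regs oam_regs_for_base(uint32_t base)
{
	/* Guard against address wrap-around at the largest offset used. */
	assert(base <= UINT32_MAX - OAM_CONTEXT_CONTROL_OFFSET);

	return (struct oam_regs) {
		.base = base,
		.oa_head_ptr = base + OAM_HEAD_POINTER_OFFSET,
		.oa_tail_ptr = base + OAM_TAIL_POINTER_OFFSET,
		.oa_buffer = base + OAM_BUFFER_OFFSET,
		.oa_ctx_ctrl = base + OAM_CONTEXT_CONTROL_OFFSET,
		.oa_ctrl = base + OAM_CONTROL_OFFSET,
		.oa_debug = base + OAM_DEBUG_OFFSET,
		.oa_status = base + OAM_STATUS_OFFSET,
		.oa_ctrl_counter_format_shift =
			OAM_CONTROL_COUNTER_FORMAT_SHIFT,
	};
}
```

For the MTL media unit at base 0x393000 this places the head pointer at
0x3931a0 and the control register at 0x393194.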

UMD specific changes for GPUvis support:
https://patchwork.freedesktop.org/patch/522827/?series=114023
https://patchwork.freedesktop.org/patch/522822/?series=114023
https://patchwork.freedesktop.org/patch/522826/?series=114023
https://patchwork.freedesktop.org/patch/522828/?series=114023
https://patchwork.freedesktop.org/patch/522816/?series=114023
https://patchwork.freedesktop.org/patch/522825/?series=114023

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |   2 +
 drivers/gpu/drm/i915/i915_pci.c          |   1 +
 drivers/gpu/drm/i915/i915_perf.c         | 247 ++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_perf_oa_regs.h |  78 +++++++
 drivers/gpu/drm/i915/i915_perf_types.h   |  40 ++++
 drivers/gpu/drm/i915/intel_device_info.h |   1 +
 include/uapi/drm/i915_drm.h              |   4 +
 7 files changed, 347 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0393273faa09..f3cacbf41c86 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -856,6 +856,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 	(INTEL_INFO(dev_priv)->has_oa_bpc_reporting)
 #define HAS_OA_SLICE_CONTRIB_LIMITS(dev_priv) \
 	(INTEL_INFO(dev_priv)->has_oa_slice_contrib_limits)
+#define HAS_OAM(dev_priv) \
+	(INTEL_INFO(dev_priv)->has_oam)
 
 /*
  * Set this flag, when platform requires 64K GTT page sizes or larger for
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index a8d942b16223..621730b6551c 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1028,6 +1028,7 @@ static const struct intel_device_info adl_p_info = {
 	.has_mslice_steering = 1, \
 	.has_oa_bpc_reporting = 1, \
 	.has_oa_slice_contrib_limits = 1, \
+	.has_oam = 1, \
 	.has_rc6 = 1, \
 	.has_reset_engine = 1, \
 	.has_rps = 1, \
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index f028df812067..a57690f4c531 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -192,6 +192,7 @@
  */
 
 #include <linux/anon_inodes.h>
+#include <linux/nospec.h>
 #include <linux/sizes.h>
 #include <linux/uuid.h>
 
@@ -326,6 +327,13 @@ static const struct i915_oa_format oa_formats[I915_OA_FORMAT_MAX] = {
 	[I915_OA_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 },
 	[I915_OAR_FORMAT_A32u40_A4u32_B8_C8]    = { 5, 256 },
 	[I915_OA_FORMAT_A24u40_A14u32_B8_C8]    = { 5, 256 },
+	[I915_OAM_FORMAT_MPEC8u64_B8_C8]	= { 1, 192, TYPE_OAM, HDR_64_BIT },
+	[I915_OAM_FORMAT_MPEC8u32_B8_C8]	= { 2, 128, TYPE_OAM, HDR_64_BIT },
+};
+
+/* PERF_GROUP_OAG is unused for oa_base, drop it for mtl */
+static const u32 mtl_oa_base[] = {
+	[PERF_GROUP_OAM_SAMEDIA_0] = 0x393000,
 };
 
 #define SAMPLE_OA_REPORT      (1<<0)
@@ -418,11 +426,17 @@ static void free_oa_config_bo(struct i915_oa_config_bo *oa_bo)
 	kfree(oa_bo);
 }
 
+static inline const
+struct i915_perf_regs *__oa_regs(struct i915_perf_stream *stream)
+{
+	return &stream->oa_buffer.group->regs;
+}
+
 static u32 gen12_oa_hw_tail_read(struct i915_perf_stream *stream)
 {
 	struct intel_uncore *uncore = stream->uncore;
 
-	return intel_uncore_read(uncore, GEN12_OAG_OATAILPTR) &
+	return intel_uncore_read(uncore, __oa_regs(stream)->oa_tail_ptr) &
 	       GEN12_OAG_OATAILPTR_MASK;
 }
 
@@ -886,7 +900,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 		i915_reg_t oaheadptr;
 
 		oaheadptr = GRAPHICS_VER(stream->perf->i915) == 12 ?
-			    GEN12_OAG_OAHEADPTR : GEN8_OAHEADPTR;
+			    __oa_regs(stream)->oa_head_ptr :
+			    GEN8_OAHEADPTR;
 
 		spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
 
@@ -939,7 +954,8 @@ static int gen8_oa_read(struct i915_perf_stream *stream,
 		return -EIO;
 
 	oastatus_reg = GRAPHICS_VER(stream->perf->i915) == 12 ?
-		       GEN12_OAG_OASTATUS : GEN8_OASTATUS;
+		       __oa_regs(stream)->oa_status :
+		       GEN8_OASTATUS;
 
 	oastatus = intel_uncore_read(uncore, oastatus_reg);
 
@@ -1643,16 +1659,46 @@ free_noa_wait(struct i915_perf_stream *stream)
 	i915_vma_unpin_and_release(&stream->noa_wait, 0);
 }
 
+/*
+ * intel_engine_lookup_user ensures that most of engine specific checks are
+ * taken care of, however, we can run into a case where the OA unit catering to
+ * the engine passed by the user is disabled for some reason. In such cases,
+ * ensure oa unit corresponding to an engine is functional. If there are no
+ * engines in the group, the unit is disabled.
+ */
+static bool oa_unit_functional(const struct intel_engine_cs *engine)
+{
+	return engine->oa_group && engine->oa_group->num_engines;
+}
+
 static bool engine_supports_oa(const struct intel_engine_cs *engine)
 {
 	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
 
 	switch (platform) {
+	case INTEL_METEORLAKE:
+		return engine->class == RENDER_CLASS ||
+		       ((engine->class == VIDEO_DECODE_CLASS ||
+			 engine->class == VIDEO_ENHANCEMENT_CLASS) &&
+			engine->gt->type == GT_MEDIA);
 	default:
 		return engine->class == RENDER_CLASS;
 	}
 }
 
+static bool engine_class_supports_oa_format(struct intel_engine_cs *engine, int type)
+{
+	switch (engine->class) {
+	case RENDER_CLASS:
+		return type == TYPE_OAG;
+	case VIDEO_DECODE_CLASS:
+	case VIDEO_ENHANCEMENT_CLASS:
+		return type == TYPE_OAM;
+	default:
+		return false;
+	}
+}
+
 static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 {
 	struct i915_perf *perf = stream->perf;
@@ -1680,7 +1726,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 		drm_WARN_ON(&gt->i915->drm,
 			    intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc));
 
-	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
+	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
 	intel_engine_pm_put(stream->engine);
 
 	if (stream->ctx)
@@ -1804,8 +1850,8 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
 
 	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
 
-	intel_uncore_write(uncore, GEN12_OAG_OASTATUS, 0);
-	intel_uncore_write(uncore, GEN12_OAG_OAHEADPTR,
+	intel_uncore_write(uncore, __oa_regs(stream)->oa_status, 0);
+	intel_uncore_write(uncore, __oa_regs(stream)->oa_head_ptr,
 			   gtt_offset & GEN12_OAG_OAHEADPTR_MASK);
 	stream->oa_buffer.head = gtt_offset;
 
@@ -1817,9 +1863,9 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
 	 *  to enable proper functionality of the overflow
 	 *  bit."
 	 */
-	intel_uncore_write(uncore, GEN12_OAG_OABUFFER, gtt_offset |
+	intel_uncore_write(uncore, __oa_regs(stream)->oa_buffer, gtt_offset |
 			   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT);
-	intel_uncore_write(uncore, GEN12_OAG_OATAILPTR,
+	intel_uncore_write(uncore, __oa_regs(stream)->oa_tail_ptr,
 			   gtt_offset & GEN12_OAG_OATAILPTR_MASK);
 
 	/* Mark that we need updated tail pointers to read from... */
@@ -2579,7 +2625,8 @@ gen8_modify_self(struct intel_context *ce,
 	return err;
 }
 
-static int gen8_configure_context(struct i915_gem_context *ctx,
+static int gen8_configure_context(struct i915_perf_stream *stream,
+				  struct i915_gem_context *ctx,
 				  struct flex *flex, unsigned int count)
 {
 	struct i915_gem_engines_iter it;
@@ -2589,7 +2636,8 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
 	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
 		GEM_BUG_ON(ce == ce->engine->kernel_context);
 
-		if (!engine_supports_oa(ce->engine))
+		if (!engine_supports_oa(ce->engine) ||
+		    ce->engine->class != stream->engine->class)
 			continue;
 
 		/* Otherwise OA settings will be set upon first use */
@@ -2720,7 +2768,7 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
 
 		spin_unlock(&i915->gem.contexts.lock);
 
-		err = gen8_configure_context(ctx, regs, num_regs);
+		err = gen8_configure_context(stream, ctx, regs, num_regs);
 		if (err) {
 			i915_gem_context_put(ctx);
 			return err;
@@ -2740,7 +2788,8 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
 	for_each_uabi_engine(engine, i915) {
 		struct intel_context *ce = engine->kernel_context;
 
-		if (!engine_supports_oa(ce->engine))
+		if (!engine_supports_oa(ce->engine) ||
+		    ce->engine->class != stream->engine->class)
 			continue;
 
 		regs[0].value = intel_sseu_make_rpcs(engine->gt, &ce->sseu);
@@ -2765,6 +2814,9 @@ gen12_configure_all_contexts(struct i915_perf_stream *stream,
 		},
 	};
 
+	if (stream->engine->class != RENDER_CLASS)
+		return 0;
+
 	return oa_configure_all_contexts(stream,
 					 regs, ARRAY_SIZE(regs),
 					 active);
@@ -2894,7 +2946,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
 				   _MASKED_BIT_ENABLE(GEN12_DISABLE_DOP_GATING));
 	}
 
-	intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG,
+	intel_uncore_write(uncore, __oa_regs(stream)->oa_debug,
 			   /* Disable clk ratio reports, like previous Gens. */
 			   _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS |
 					      GEN12_OAG_OA_DEBUG_INCLUDE_CLK_RATIO) |
@@ -2904,7 +2956,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
 			    */
 			   oag_report_ctx_switches(stream));
 
-	intel_uncore_write(uncore, GEN12_OAG_OAGLBCTXCTRL, periodic ?
+	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctx_ctrl, periodic ?
 			   (GEN12_OAG_OAGLBCTXCTRL_COUNTER_RESUME |
 			    GEN12_OAG_OAGLBCTXCTRL_TIMER_ENABLE |
 			    (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT))
@@ -3058,8 +3110,8 @@ static void gen8_oa_enable(struct i915_perf_stream *stream)
 
 static void gen12_oa_enable(struct i915_perf_stream *stream)
 {
-	struct intel_uncore *uncore = stream->uncore;
-	u32 report_format = stream->oa_buffer.format->format;
+	const struct i915_perf_regs *regs;
+	u32 val;
 
 	/*
 	 * If we don't want OA reports from the OA buffer, then we don't even
@@ -3070,9 +3122,11 @@ static void gen12_oa_enable(struct i915_perf_stream *stream)
 
 	gen12_init_oa_buffer(stream);
 
-	intel_uncore_write(uncore, GEN12_OAG_OACONTROL,
-			   (report_format << GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT) |
-			   GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE);
+	regs = __oa_regs(stream);
+	val = (stream->oa_buffer.format->format << regs->oa_ctrl_counter_format_shift) |
+	      GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE;
+
+	intel_uncore_write(stream->uncore, regs->oa_ctrl, val);
 }
 
 /**
@@ -3124,9 +3178,9 @@ static void gen12_oa_disable(struct i915_perf_stream *stream)
 {
 	struct intel_uncore *uncore = stream->uncore;
 
-	intel_uncore_write(uncore, GEN12_OAG_OACONTROL, 0);
+	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctrl, 0);
 	if (intel_wait_for_register(uncore,
-				    GEN12_OAG_OACONTROL,
+				    __oa_regs(stream)->oa_ctrl,
 				    GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE, 0,
 				    50))
 		drm_err(&stream->perf->i915->drm,
@@ -3329,6 +3383,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 
 	stream->sample_size = sizeof(struct drm_i915_perf_record_header);
 
+	stream->oa_buffer.group = g;
 	stream->oa_buffer.format = &perf->oa_formats[props->oa_format];
 	if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format->size == 0))
 		return -EINVAL;
@@ -3379,7 +3434,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	 *   references will effectively disable RC6.
 	 */
 	intel_engine_pm_get(stream->engine);
-	intel_uncore_forcewake_get(stream->uncore, FORCEWAKE_ALL);
+	intel_uncore_forcewake_get(stream->uncore, g->fw_domains);
 
 	/*
 	 * Wa_16011777198:dg2: GuC resets render as part of the Wa. This causes
@@ -3440,7 +3495,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc);
 
 err_gucrc:
-	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
+	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
 	intel_engine_pm_put(stream->engine);
 
 	free_oa_configs(stream);
@@ -4033,6 +4088,7 @@ static int read_properties_unlocked(struct i915_perf *perf,
 				    struct perf_open_properties *props)
 {
 	struct drm_i915_gem_context_param_sseu user_sseu;
+	const struct i915_oa_format *f;
 	u64 __user *uprop = uprops;
 	bool config_sseu = false;
 	u8 class, instance;
@@ -4203,6 +4259,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
 	if (!engine_supports_oa(props->engine))
 		return -EINVAL;
 
+	if (!oa_unit_functional(props->engine))
+		return -ENODEV;
+
+	i = array_index_nospec(props->oa_format, I915_OA_FORMAT_MAX);
+	f = &perf->oa_formats[i];
+	if (!engine_class_supports_oa_format(props->engine, f->type)) {
+		DRM_DEBUG("Invalid OA format %d for class %d\n",
+			  f->type, props->engine->class);
+		return -EINVAL;
+	}
+
 	if (config_sseu) {
 		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
 		if (ret) {
@@ -4383,6 +4450,14 @@ static const struct i915_range gen12_oa_b_counters[] = {
 	{}
 };
 
+static const struct i915_range mtl_oam_b_counters[] = {
+	{ .start = 0x393000, .end = 0x39301c },	/* GEN12_OAM_STARTTRIG1[1-8] */
+	{ .start = 0x393020, .end = 0x39303c },	/* GEN12_OAM_REPORTTRIG1[1-8] */
+	{ .start = 0x393040, .end = 0x39307c },	/* GEN12_OAM_CEC[0-7][0-1] */
+	{ .start = 0x393200, .end = 0x39323c },	/* MPES[0-7] */
+	{}
+};
+
 static const struct i915_range xehp_oa_b_counters[] = {
 	{ .start = 0xdc48, .end = 0xdc48 },	/* OAA_ENABLE_REG */
 	{ .start = 0xdd00, .end = 0xdd48 },	/* OAG_LCE0_0 - OAA_LENABLE_REG */
@@ -4429,13 +4504,16 @@ static const struct i915_range gen12_oa_mux_regs[] = {
 
 /*
  * Ref: 14010536224:
- * 0x20cc is repurposed on MTL, so use a separate array for MTL.
+ * 0x20cc is repurposed on MTL, so use a separate array for MTL. Also add the
+ * MPES/MPEC registers.
  */
 static const struct i915_range mtl_oa_mux_regs[] = {
 	{ .start = 0x0d00, .end = 0x0d04 },	/* RPM_CONFIG[0-1] */
 	{ .start = 0x0d0c, .end = 0x0d2c },	/* NOA_CONFIG[0-8] */
 	{ .start = 0x9840, .end = 0x9840 },	/* GDT_CHICKEN_BITS */
 	{ .start = 0x9884, .end = 0x9888 },	/* NOA_WRITE */
+	{ .start = 0x38d100, .end = 0x38d114 },	/* VISACTL */
+	{}
 };
 
 static bool gen7_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
@@ -4473,10 +4551,26 @@ static bool gen12_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
 	return reg_in_range_table(addr, gen12_oa_b_counters);
 }
 
+static bool xehp_is_valid_oam_b_counter_addr(struct i915_perf *perf, u32 addr)
+{
+	enum intel_platform platform = INTEL_INFO(perf->i915)->platform;
+
+	if (!HAS_OAM(perf->i915))
+		return false;
+
+	switch (platform) {
+	case INTEL_METEORLAKE:
+		return reg_in_range_table(addr, mtl_oam_b_counters);
+	default:
+		return false;
+	}
+}
+
 static bool xehp_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
 {
 	return reg_in_range_table(addr, xehp_oa_b_counters) ||
-		reg_in_range_table(addr, gen12_oa_b_counters);
+		reg_in_range_table(addr, gen12_oa_b_counters) ||
+		xehp_is_valid_oam_b_counter_addr(perf, addr);
 }
 
 static bool gen12_is_valid_mux_addr(struct i915_perf *perf, u32 addr)
@@ -4846,11 +4940,39 @@ static u32 __num_perf_groups_per_gt(struct intel_gt *gt)
 	enum intel_platform platform = INTEL_INFO(gt->i915)->platform;
 
 	switch (platform) {
+	case INTEL_METEORLAKE:
+		return 1;
 	default:
 		return 1;
 	}
 }
 
+static u32 __oam_engine_group(struct intel_engine_cs *engine)
+{
+	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
+	struct intel_gt *gt = engine->gt;
+	u32 group = PERF_GROUP_INVALID;
+
+	switch (platform) {
+	case INTEL_METEORLAKE:
+		/*
+		 * There's 1 SAMEDIA gt and 1 OAM per SAMEDIA gt. All media slices
+		 * within the gt use the same OAM. All MTL SKUs list 1 SA MEDIA.
+		 */
+		drm_WARN_ON(&engine->i915->drm,
+			    engine->gt->type != GT_MEDIA);
+
+		group = PERF_GROUP_OAM_SAMEDIA_0;
+		break;
+	default:
+		break;
+	}
+
+	drm_WARN_ON(&gt->i915->drm, group >= __num_perf_groups_per_gt(gt));
+
+	return group;
+}
+
 static u32 __oa_engine_group(struct intel_engine_cs *engine)
 {
 	if (!engine_supports_oa(engine))
@@ -4860,11 +4982,58 @@ static u32 __oa_engine_group(struct intel_engine_cs *engine)
 	case RENDER_CLASS:
 		return PERF_GROUP_OAG;
 
+	case VIDEO_DECODE_CLASS:
+	case VIDEO_ENHANCEMENT_CLASS:
+		return __oam_engine_group(engine);
+
 	default:
 		return PERF_GROUP_INVALID;
 	}
 }
 
+static struct i915_perf_regs __oam_regs(u32 base)
+{
+	return (struct i915_perf_regs) {
+		base,
+		GEN12_OAM_HEAD_POINTER(base),
+		GEN12_OAM_TAIL_POINTER(base),
+		GEN12_OAM_BUFFER(base),
+		GEN12_OAM_CONTEXT_CONTROL(base),
+		GEN12_OAM_CONTROL(base),
+		GEN12_OAM_DEBUG(base),
+		GEN12_OAM_STATUS(base),
+		GEN12_OAM_CONTROL_COUNTER_FORMAT_SHIFT,
+	};
+}
+
+static struct i915_perf_regs __oag_regs(void)
+{
+	return (struct i915_perf_regs) {
+		0,
+		GEN12_OAG_OAHEADPTR,
+		GEN12_OAG_OATAILPTR,
+		GEN12_OAG_OABUFFER,
+		GEN12_OAG_OAGLBCTXCTRL,
+		GEN12_OAG_OACONTROL,
+		GEN12_OAG_OA_DEBUG,
+		GEN12_OAG_OASTATUS,
+		GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT,
+	};
+}
+
+static void oa_init_regs(struct intel_gt *gt, u32 id)
+{
+	struct i915_perf_group *group = &gt->perf.group[id];
+	struct i915_perf_regs *regs = &group->regs;
+
+	if (id == PERF_GROUP_OAG && gt->type != GT_MEDIA)
+		*regs = __oag_regs();
+	else if (IS_METEORLAKE(gt->i915))
+		*regs = __oam_regs(mtl_oa_base[id]);
+	else
+		drm_WARN(&gt->i915->drm, 1, "Unsupported platform for OA\n");
+}
+
 static void oa_init_groups(struct intel_gt *gt)
 {
 	int i, num_groups = gt->perf.num_perf_groups;
@@ -4881,6 +5050,24 @@ static void oa_init_groups(struct intel_gt *gt)
 		g->oa_unit_id = perf->oa_unit_ids++;
 
 		g->gt = gt;
+		oa_init_regs(gt, i);
+		g->fw_domains = FORCEWAKE_ALL;
+		if (i == PERF_GROUP_OAG) {
+			g->type = TYPE_OAG;
+
+			/*
+			 * Enabling all fw domains for OAG caps the max GT
+			 * frequency to media FF max. This could be less than
+			 * what the user sets through the sysfs and perf
+			 * measurements could be skewed. Since some platforms
+			 * have separate OAM units to measure media perf, do not
+			 * enable media fw domains for OAG.
+			 */
+			if (HAS_OAM(gt->i915))
+				g->fw_domains = FORCEWAKE_GT | FORCEWAKE_RENDER;
+		} else {
+			g->type = TYPE_OAM;
+		}
 	}
 }
 
@@ -4970,9 +5157,15 @@ static void oa_init_supported_formats(struct i915_perf *perf)
 		break;
 
 	case INTEL_DG2:
+		oa_format_add(perf, I915_OAR_FORMAT_A32u40_A4u32_B8_C8);
+		oa_format_add(perf, I915_OA_FORMAT_A24u40_A14u32_B8_C8);
+		break;
+
 	case INTEL_METEORLAKE:
 		oa_format_add(perf, I915_OAR_FORMAT_A32u40_A4u32_B8_C8);
 		oa_format_add(perf, I915_OA_FORMAT_A24u40_A14u32_B8_C8);
+		oa_format_add(perf, I915_OAM_FORMAT_MPEC8u64_B8_C8);
+		oa_format_add(perf, I915_OAM_FORMAT_MPEC8u32_B8_C8);
 		break;
 
 	default:
@@ -5217,8 +5410,10 @@ int i915_perf_ioctl_version(void)
 	 *
 	 * 6: Add DRM_I915_PERF_PROP_OA_ENGINE_CLASS and
 	 *    DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE
+	 *
+	 * 7: Add support for video decode and enhancement classes.
 	 */
-	return 6;
+	return 7;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/i915_perf_oa_regs.h b/drivers/gpu/drm/i915/i915_perf_oa_regs.h
index 381d94101610..ba103875e19f 100644
--- a/drivers/gpu/drm/i915/i915_perf_oa_regs.h
+++ b/drivers/gpu/drm/i915/i915_perf_oa_regs.h
@@ -138,4 +138,82 @@
 #define   GEN12_SQCNT1_PMON_ENABLE		REG_BIT(30)
 #define   GEN12_SQCNT1_OABPC			REG_BIT(29)
 
+/* Gen12 OAM unit */
+#define GEN12_OAM_HEAD_POINTER_OFFSET   (0x1a0)
+#define  GEN12_OAM_HEAD_POINTER_MASK    0xffffffc0
+
+#define GEN12_OAM_TAIL_POINTER_OFFSET   (0x1a4)
+#define  GEN12_OAM_TAIL_POINTER_MASK    0xffffffc0
+
+#define GEN12_OAM_BUFFER_OFFSET         (0x1a8)
+#define  GEN12_OAM_BUFFER_SIZE_MASK     (0x7)
+#define  GEN12_OAM_BUFFER_SIZE_SHIFT    (3)
+#define  GEN12_OAM_BUFFER_MEMORY_SELECT REG_BIT(0) /* 0: PPGTT, 1: GGTT */
+
+#define GEN12_OAM_CONTEXT_CONTROL_OFFSET              (0x1bc)
+#define  GEN12_OAM_CONTEXT_CONTROL_TIMER_PERIOD_SHIFT 2
+#define  GEN12_OAM_CONTEXT_CONTROL_TIMER_ENABLE       REG_BIT(1)
+#define  GEN12_OAM_CONTEXT_CONTROL_COUNTER_RESUME     REG_BIT(0)
+
+#define GEN12_OAM_CONTROL_OFFSET                (0x194)
+#define  GEN12_OAM_CONTROL_COUNTER_FORMAT_SHIFT 1
+#define  GEN12_OAM_CONTROL_COUNTER_ENABLE       REG_BIT(0)
+
+#define GEN12_OAM_DEBUG_OFFSET                      (0x198)
+#define  GEN12_OAM_DEBUG_BUFFER_SIZE_SELECT         REG_BIT(12)
+#define  GEN12_OAM_DEBUG_INCLUDE_CLK_RATIO          REG_BIT(6)
+#define  GEN12_OAM_DEBUG_DISABLE_CLK_RATIO_REPORTS  REG_BIT(5)
+#define  GEN12_OAM_DEBUG_DISABLE_GO_1_0_REPORTS     REG_BIT(2)
+#define  GEN12_OAM_DEBUG_DISABLE_CTX_SWITCH_REPORTS REG_BIT(1)
+
+#define GEN12_OAM_STATUS_OFFSET            (0x19c)
+#define  GEN12_OAM_STATUS_COUNTER_OVERFLOW REG_BIT(2)
+#define  GEN12_OAM_STATUS_BUFFER_OVERFLOW  REG_BIT(1)
+#define  GEN12_OAM_STATUS_REPORT_LOST      REG_BIT(0)
+
+#define GEN12_OAM_MMIO_TRG_OFFSET	(0x1d0)
+
+#define GEN12_OAM_MMIO_TRG(base) \
+	_MMIO((base) + GEN12_OAM_MMIO_TRG_OFFSET)
+
+#define GEN12_OAM_HEAD_POINTER(base) \
+	_MMIO((base) + GEN12_OAM_HEAD_POINTER_OFFSET)
+#define GEN12_OAM_TAIL_POINTER(base) \
+	_MMIO((base) + GEN12_OAM_TAIL_POINTER_OFFSET)
+#define GEN12_OAM_BUFFER(base) \
+	_MMIO((base) + GEN12_OAM_BUFFER_OFFSET)
+#define GEN12_OAM_CONTEXT_CONTROL(base) \
+	_MMIO((base) + GEN12_OAM_CONTEXT_CONTROL_OFFSET)
+#define GEN12_OAM_CONTROL(base) \
+	_MMIO((base) + GEN12_OAM_CONTROL_OFFSET)
+#define GEN12_OAM_DEBUG(base) \
+	_MMIO((base) + GEN12_OAM_DEBUG_OFFSET)
+#define GEN12_OAM_STATUS(base) \
+	_MMIO((base) + GEN12_OAM_STATUS_OFFSET)
+
+#define GEN12_OAM_CEC0_0_OFFSET		(0x40)
+#define GEN12_OAM_CEC7_1_OFFSET		(0x7c)
+#define GEN12_OAM_CEC0_0(base) \
+	_MMIO((base) + GEN12_OAM_CEC0_0_OFFSET)
+#define GEN12_OAM_CEC7_1(base) \
+	_MMIO((base) + GEN12_OAM_CEC7_1_OFFSET)
+
+#define GEN12_OAM_STARTTRIG1_OFFSET	(0x00)
+#define GEN12_OAM_STARTTRIG8_OFFSET	(0x1c)
+#define GEN12_OAM_STARTTRIG1(base) \
+	_MMIO((base) + GEN12_OAM_STARTTRIG1_OFFSET)
+#define GEN12_OAM_STARTTRIG8(base) \
+	_MMIO((base) + GEN12_OAM_STARTTRIG8_OFFSET)
+
+#define GEN12_OAM_REPORTTRIG1_OFFSET	(0x20)
+#define GEN12_OAM_REPORTTRIG8_OFFSET	(0x3c)
+#define GEN12_OAM_REPORTTRIG1(base) \
+	_MMIO((base) + GEN12_OAM_REPORTTRIG1_OFFSET)
+#define GEN12_OAM_REPORTTRIG8(base) \
+	_MMIO((base) + GEN12_OAM_REPORTTRIG8_OFFSET)
+
+#define GEN12_OAM_PERF_COUNTER_B0_OFFSET	(0x84)
+#define GEN12_OAM_PERF_COUNTER_B(base, idx) \
+	_MMIO((base) + GEN12_OAM_PERF_COUNTER_B0_OFFSET + 4 * (idx))
+
 #endif /* __INTEL_PERF_OA_REGS__ */
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
index 8ccb0b89d019..5b2c3bab60f8 100644
--- a/drivers/gpu/drm/i915/i915_perf_types.h
+++ b/drivers/gpu/drm/i915/i915_perf_types.h
@@ -20,6 +20,7 @@
 #include "gt/intel_engine_types.h"
 #include "gt/intel_sseu.h"
 #include "i915_reg_defs.h"
+#include "intel_uncore.h"
 #include "intel_wakeref.h"
 
 struct drm_i915_private;
@@ -33,6 +34,7 @@ struct intel_engine_cs;
 
 enum {
 	PERF_GROUP_OAG = 0,
+	PERF_GROUP_OAM_SAMEDIA_0 = 0,
 
 	PERF_GROUP_MAX,
 	PERF_GROUP_INVALID = U32_MAX,
@@ -43,9 +45,27 @@ enum report_header {
 	HDR_64_BIT,
 };
 
+struct i915_perf_regs {
+	u32 base;
+	i915_reg_t oa_head_ptr;
+	i915_reg_t oa_tail_ptr;
+	i915_reg_t oa_buffer;
+	i915_reg_t oa_ctx_ctrl;
+	i915_reg_t oa_ctrl;
+	i915_reg_t oa_debug;
+	i915_reg_t oa_status;
+	u32 oa_ctrl_counter_format_shift;
+};
+
+enum {
+	TYPE_OAG,
+	TYPE_OAM,
+};
+
 struct i915_oa_format {
 	u32 format;
 	int size;
+	int type;
 	enum report_header header;
 };
 
@@ -317,6 +337,11 @@ struct i915_perf_stream {
 		 * @tail: The last verified tail that can be read by userspace.
 		 */
 		u32 tail;
+
+		/**
+		 * @group: The group object for this OA buffer.
+		 */
+		struct i915_perf_group *group;
 	} oa_buffer;
 
 	/**
@@ -431,6 +456,21 @@ struct i915_perf_group {
 	 * @engine_mask: A mask of engines using a single OA buffer.
 	 */
 	intel_engine_mask_t engine_mask;
+
+	/*
+	 * @regs: OA buffer register group for programming the OA unit.
+	 */
+	struct i915_perf_regs regs;
+
+	/*
+	 * @type: Type of OA buffer, OAM, OAG etc.
+	 */
+	int type;
+
+	/*
+	 * @fw_domains: forcewake domains required for this group.
+	 */
+	enum forcewake_domains fw_domains;
 };
 
 struct i915_perf_gt {
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 80bda653d61b..45e218327f44 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -166,6 +166,7 @@ enum intel_ppgtt_type {
 	func(has_mslice_steering); \
 	func(has_oa_bpc_reporting); \
 	func(has_oa_slice_contrib_limits); \
+	func(has_oam); \
 	func(has_one_eu_per_fuse_bit); \
 	func(has_pxp); \
 	func(has_rc6); \
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index b6922b52d85c..70bfa6530dbc 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2676,6 +2676,10 @@ enum drm_i915_oa_format {
 	I915_OAR_FORMAT_A32u40_A4u32_B8_C8,
 	I915_OA_FORMAT_A24u40_A14u32_B8_C8,
 
+	/* MTL OAM */
+	I915_OAM_FORMAT_MPEC8u64_B8_C8,
+	I915_OAM_FORMAT_MPEC8u32_B8_C8,
+
 	I915_OA_FORMAT_MAX	    /* non-ABI */
 };
 
-- 
2.36.1
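
For readers following the register hunk above, here is a minimal sketch of how
the per-unit macros resolve to an absolute offset. It is an illustration under
stated assumptions, not driver code: the SKETCH_* names and the base value
0x10000 are hypothetical, and the macros return plain integers instead of the
kernel's _MMIO()-wrapped i915_reg_t. Only the offsets and the head pointer mask
are copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and mask copied from the i915_perf_oa_regs.h hunk above. */
#define GEN12_OAM_HEAD_POINTER_OFFSET		0x1a0
#define GEN12_OAM_HEAD_POINTER_MASK		0xffffffc0u
#define GEN12_OAM_TAIL_POINTER_OFFSET		0x1a4
#define GEN12_OAM_PERF_COUNTER_B0_OFFSET	0x84

/* Hypothetical base, for illustration only; not taken from the patch. */
#define SKETCH_OAM_BASE				0x10000u

/* Plain-integer stand-ins for the _MMIO()-wrapped macros in the patch. */
#define SKETCH_OAM_HEAD_POINTER(base) \
	((uint32_t)((base) + GEN12_OAM_HEAD_POINTER_OFFSET))
#define SKETCH_OAM_TAIL_POINTER(base) \
	((uint32_t)((base) + GEN12_OAM_TAIL_POINTER_OFFSET))
#define SKETCH_OAM_PERF_COUNTER_B(base, idx) \
	((uint32_t)((base) + GEN12_OAM_PERF_COUNTER_B0_OFFSET + 4 * (idx)))

/* Keep only the address bits of a raw head pointer value. */
static inline uint32_t sketch_oam_head_addr(uint32_t raw)
{
	uint32_t addr = raw & GEN12_OAM_HEAD_POINTER_MASK;

	/* The mask clears the low 6 bits, so the result is 64-byte aligned. */
	assert((addr & 0x3f) == 0);
	return addr;
}
```

Because every register is expressed as base plus a fixed offset, one set of
macros can serve any unit that shares this layout; only the base changes.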



* Re: [Intel-gfx] [PATCH v2 3/9] drm/i915/perf: Validate OA sseu config outside switch
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 3/9] drm/i915/perf: Validate OA sseu config outside switch Umesh Nerlige Ramappa
@ 2023-02-17  1:10   ` Dixit, Ashutosh
  0 siblings, 0 replies; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-17  1:10 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Thu, 16 Feb 2023 16:58:44 -0800, Umesh Nerlige Ramappa wrote:
>
> Once OA supports media engine class:instance, the engine can only be
> validated outside the switch since class and instance parameters are
> separate entities. Since OA sseu config depends on engine
> class:instance, validate OA sseu config outside the switch.
>
> v2: (Ashutosh)
> - Clarify commit message
> - Use drm_dbg instead of DRM_DEBUG
> - Reorder stack variables

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>

>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_perf.c | 23 +++++++++++++----------
>  1 file changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index b0e1acbe90fc..1229f65534e2 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -3950,7 +3950,9 @@ static int read_properties_unlocked(struct i915_perf *perf,
>				    u32 n_props,
>				    struct perf_open_properties *props)
>  {
> +	struct drm_i915_gem_context_param_sseu user_sseu;
>	u64 __user *uprop = uprops;
> +	bool config_sseu = false;
>	u32 i;
>	int ret;
>
> @@ -4079,8 +4081,6 @@ static int read_properties_unlocked(struct i915_perf *perf,
>			props->hold_preemption = !!value;
>			break;
>		case DRM_I915_PERF_PROP_GLOBAL_SSEU: {
> -			struct drm_i915_gem_context_param_sseu user_sseu;
> -
>			if (GRAPHICS_VER_FULL(perf->i915) >= IP_VER(12, 50)) {
>				drm_dbg(&perf->i915->drm,
>					"SSEU config not supported on gfx %x\n",
> @@ -4095,14 +4095,7 @@ static int read_properties_unlocked(struct i915_perf *perf,
>					"Unable to copy global sseu parameter\n");
>				return -EFAULT;
>			}
> -
> -			ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
> -			if (ret) {
> -				drm_dbg(&perf->i915->drm,
> -					"Invalid SSEU configuration\n");
> -				return ret;
> -			}
> -			props->has_sseu = true;
> +			config_sseu = true;
>			break;
>		}
>		case DRM_I915_PERF_PROP_POLL_OA_PERIOD:
> @@ -4122,6 +4115,16 @@ static int read_properties_unlocked(struct i915_perf *perf,
>		uprop += 2;
>	}
>
> +	if (config_sseu) {
> +		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
> +		if (ret) {
> +			drm_dbg(&perf->i915->drm,
> +				"Invalid SSEU configuration\n");
> +			return ret;
> +		}
> +		props->has_sseu = true;
> +	}
> +
>	return 0;
>  }
>
> --
> 2.36.1
>
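
The reordering in the quoted patch can be sketched in isolation as follows.
This is a simplified, hypothetical model (all sketch_* names, the property IDs
and the validation rule are invented for illustration; none of them are i915
APIs). It assumes properties arrive as (id, value) pairs in any order, so a
check that depends on the engine has to wait until the loop has finished.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property IDs; not the real DRM_I915_PERF_PROP_* values. */
enum sketch_prop {
	SKETCH_PROP_ENGINE_CLASS,
	SKETCH_PROP_ENGINE_INSTANCE,
	SKETCH_PROP_SSEU,
};

struct sketch_props {
	uint64_t engine_class;
	uint64_t engine_instance;
	uint64_t sseu_mask;
	bool has_sseu;
};

/* Hypothetical rule: only class 0 accepts a non-empty SSEU mask. */
static int sketch_validate_sseu(const struct sketch_props *p, uint64_t mask)
{
	return (p->engine_class == 0 && mask != 0) ? 0 : -EINVAL;
}

static int sketch_read_props(const uint64_t *uprops, size_t n_props,
			     struct sketch_props *props)
{
	uint64_t user_sseu = 0;
	bool config_sseu = false;
	size_t i;
	int ret;

	assert(uprops && props);

	for (i = 0; i < n_props; i++) {
		uint64_t id = uprops[2 * i];
		uint64_t value = uprops[2 * i + 1];

		switch (id) {
		case SKETCH_PROP_ENGINE_CLASS:
			props->engine_class = value;
			break;
		case SKETCH_PROP_ENGINE_INSTANCE:
			props->engine_instance = value;
			break;
		case SKETCH_PROP_SSEU:
			/* Only record the request; the engine may not be final yet. */
			user_sseu = value;
			config_sseu = true;
			break;
		default:
			return -EINVAL;
		}
	}

	/* All properties are parsed, so the engine is known: validate now. */
	if (config_sseu) {
		ret = sketch_validate_sseu(props, user_sseu);
		if (ret)
			return ret;
		props->sseu_mask = user_sseu;
		props->has_sseu = true;
	}

	return 0;
}
```

With this shape, an SSEU property that appears before the engine class is
still checked against the engine the caller finally selected.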


* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Add OAM support for MTL (rev2)
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (8 preceding siblings ...)
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units Umesh Nerlige Ramappa
@ 2023-02-17  1:35 ` Patchwork
  2023-02-17  1:55 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  2023-02-17 16:09 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  11 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2023-02-17  1:35 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

== Series Details ==

Series: Add OAM support for MTL (rev2)
URL   : https://patchwork.freedesktop.org/series/114033/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.




* [Intel-gfx] ✓ Fi.CI.BAT: success for Add OAM support for MTL (rev2)
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (9 preceding siblings ...)
  2023-02-17  1:35 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Add OAM support for MTL (rev2) Patchwork
@ 2023-02-17  1:55 ` Patchwork
  2023-02-17 16:09 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  11 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2023-02-17  1:55 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx


== Series Details ==

Series: Add OAM support for MTL (rev2)
URL   : https://patchwork.freedesktop.org/series/114033/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12754 -> Patchwork_114033v2
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/index.html

Participating hosts (40 -> 38)
------------------------------

  Missing    (2): bat-dg1-6 fi-snb-2520m 

Known issues
------------

  Here are the changes found in Patchwork_114033v2 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live@migrate:
    - bat-dg2-11:         NOTRUN -> [DMESG-FAIL][1] ([i915#7699])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/bat-dg2-11/igt@i915_selftest@live@migrate.html

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - bat-dg2-11:         NOTRUN -> [SKIP][2] ([i915#7828])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/bat-dg2-11/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor@atomic-transitions:
    - fi-bsw-n3050:       [PASS][3] -> [FAIL][4] ([i915#6298])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/fi-bsw-n3050/igt@kms_cursor_legacy@basic-busy-flip-before-cursor@atomic-transitions.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/fi-bsw-n3050/igt@kms_cursor_legacy@basic-busy-flip-before-cursor@atomic-transitions.html

  * igt@kms_pipe_crc_basic@suspend-read-crc:
    - bat-dg2-11:         NOTRUN -> [SKIP][5] ([i915#5354])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/bat-dg2-11/igt@kms_pipe_crc_basic@suspend-read-crc.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-apl-guc:         [DMESG-FAIL][6] ([i915#5334]) -> [PASS][7]
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html

  * igt@i915_selftest@live@guc:
    - bat-rpls-2:         [DMESG-WARN][8] ([i915#7852]) -> [PASS][9]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/bat-rpls-2/igt@i915_selftest@live@guc.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/bat-rpls-2/igt@i915_selftest@live@guc.html

  * igt@i915_selftest@live@workarounds:
    - bat-dg2-11:         [INCOMPLETE][10] ([i915#7913]) -> [PASS][11]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/bat-dg2-11/igt@i915_selftest@live@workarounds.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/bat-dg2-11/igt@i915_selftest@live@workarounds.html

  
#### Warnings ####

  * igt@i915_selftest@live@slpc:
    - bat-rpls-2:         [DMESG-FAIL][12] ([i915#6997]) -> [DMESG-FAIL][13] ([i915#6367] / [i915#7996])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/bat-rpls-2/igt@i915_selftest@live@slpc.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/bat-rpls-2/igt@i915_selftest@live@slpc.html
    - bat-rpls-1:         [DMESG-FAIL][14] ([i915#6367] / [i915#7996]) -> [DMESG-FAIL][15] ([i915#6367])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/bat-rpls-1/igt@i915_selftest@live@slpc.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/bat-rpls-1/igt@i915_selftest@live@slpc.html

  
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#5354]: https://gitlab.freedesktop.org/drm/intel/issues/5354
  [i915#6298]: https://gitlab.freedesktop.org/drm/intel/issues/6298
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6997]: https://gitlab.freedesktop.org/drm/intel/issues/6997
  [i915#7699]: https://gitlab.freedesktop.org/drm/intel/issues/7699
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7852]: https://gitlab.freedesktop.org/drm/intel/issues/7852
  [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
  [i915#7996]: https://gitlab.freedesktop.org/drm/intel/issues/7996


Build changes
-------------

  * IGT: IGT_7161 -> IGTPW_8498
  * Linux: CI_DRM_12754 -> Patchwork_114033v2

  CI-20190529: 20190529
  CI_DRM_12754: e9f03769fd297b17cd356ec6274e4824511e750c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_8498: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8498/index.html
  IGT_7161: 5574f110ae838031eef6db5236bad02e8c2d2dee @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_114033v2: e9f03769fd297b17cd356ec6274e4824511e750c @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

81ba197c0458 drm/i915/perf: Add support for OA media units
969a31fc8893 drm/i915/perf: Add engine class instance parameters to perf
051fda74a777 drm/i915/perf: Handle non-power-of-2 reports
a3eec22925b2 drm/i915/perf: Parse 64bit report header formats correctly
03af54870886 drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM
1847444238ee drm/i915/perf: Group engines into respective OA groups
56dd42e55f88 drm/i915/perf: Validate OA sseu config outside switch
f640ebc3c42f drm/i915/perf: Add helper to check supported OA engines
c8af7e3a03e8 drm/i915/perf: Drop wakeref on GuC RC error

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/index.html



* Re: [Intel-gfx] [PATCH v2 5/9] drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 5/9] drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM Umesh Nerlige Ramappa
@ 2023-02-17  2:04   ` Dixit, Ashutosh
  2023-02-17  9:55     ` Jani Nikula
  0 siblings, 1 reply; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-17  2:04 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Thu, 16 Feb 2023 16:58:46 -0800, Umesh Nerlige Ramappa wrote:
>
> i915_perf_init can fail due to OOM. Fail driver init if i915_perf_init
> fails.
>
> v2: (Jani)
> - Reorder patch in the series

Jani seemed OK with this: a drm_err would get lost in the dmesg deluge on
OOM, so it's better to fail the probe, as long as OOM is the only failure.

> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_driver.c | 4 +++-
>  drivers/gpu/drm/i915/i915_perf.c   | 8 ++++++--
>  drivers/gpu/drm/i915/i915_perf.h   | 2 +-
>  3 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
> index 0c0ae3eabb4b..998ca41c9713 100644
> --- a/drivers/gpu/drm/i915/i915_driver.c
> +++ b/drivers/gpu/drm/i915/i915_driver.c
> @@ -477,7 +477,9 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
>	if (ret)
>		return ret;
>
> -	i915_perf_init(dev_priv);
> +	ret = i915_perf_init(dev_priv);

Maybe add a comment here like this to allay people's concerns?

	/* The only non-zero ret here is -ENOMEM */

or even:

	drm_WARN_ON(&dev_priv->drm, ret && ret != -ENOMEM);

Otherwise this is:

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>

> +	if (ret)
> +		return ret;
>
>	ret = i915_ggtt_probe_hw(dev_priv);
>	if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 37c4cc44d68c..3306653c0b85 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -4941,7 +4941,7 @@ static void i915_perf_init_info(struct drm_i915_private *i915)
>   * Note: i915-perf initialization is split into an 'init' and 'register'
>   * phase with the i915_perf_register() exposing state to userspace.
>   */
> -void i915_perf_init(struct drm_i915_private *i915)
> +int i915_perf_init(struct drm_i915_private *i915)
>  {
>	struct i915_perf *perf = &i915->perf;
>
> @@ -5057,12 +5057,16 @@ void i915_perf_init(struct drm_i915_private *i915)
>		perf->i915 = i915;
>
>		ret = oa_init_engine_groups(perf);
> -		if (ret)
> +		if (ret) {
>			drm_err(&i915->drm,
>				"OA initialization failed %d\n", ret);
> +			return ret;
> +		}
>
>		oa_init_supported_formats(perf);
>	}
> +
> +	return 0;
>  }
>
>  static int destroy_config(int id, void *p, void *data)
> diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h
> index f96e09a4af04..253637651d5e 100644
> --- a/drivers/gpu/drm/i915/i915_perf.h
> +++ b/drivers/gpu/drm/i915/i915_perf.h
> @@ -18,7 +18,7 @@ struct i915_oa_config;
>  struct intel_context;
>  struct intel_engine_cs;
>
> -void i915_perf_init(struct drm_i915_private *i915);
> +int i915_perf_init(struct drm_i915_private *i915);
>  void i915_perf_fini(struct drm_i915_private *i915);
>  void i915_perf_register(struct drm_i915_private *i915);
>  void i915_perf_unregister(struct drm_i915_private *i915);
> --
> 2.36.1
>
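
The void-to-int change discussed above follows a common error propagation
pattern, sketched below in standalone form. Everything here is hypothetical
(the sketch_* names and the simulate_oom flag are invented for illustration)
and it only models the control flow: an init step that can fail with -ENOMEM,
and a probe function that stops at the first failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state, standing in for the driver private struct. */
struct sketch_dev {
	bool simulate_oom;	/* force the allocation step to fail */
	bool perf_ready;
	bool ggtt_probed;
};

/* Stand-in for a step that allocates memory and may fail. */
static int sketch_init_engine_groups(struct sketch_dev *dev)
{
	return dev->simulate_oom ? -ENOMEM : 0;
}

/* Returns 0 on success or a negative errno, instead of void. */
static int sketch_perf_init(struct sketch_dev *dev)
{
	int ret = sketch_init_engine_groups(dev);

	if (ret)
		return ret;

	dev->perf_ready = true;
	return 0;
}

static int sketch_hw_probe(struct sketch_dev *dev)
{
	int ret;

	assert(dev);

	ret = sketch_perf_init(dev);
	if (ret)
		return ret;	/* later probe steps never run */

	dev->ggtt_probed = true;
	return 0;
}
```

The caller needs no knowledge of which errors the init step can return; it
only forwards a non-zero result, which keeps the probe function short.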


* Re: [Intel-gfx] [PATCH v2 5/9] drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM
  2023-02-17  2:04   ` Dixit, Ashutosh
@ 2023-02-17  9:55     ` Jani Nikula
  0 siblings, 0 replies; 33+ messages in thread
From: Jani Nikula @ 2023-02-17  9:55 UTC (permalink / raw)
  To: Dixit, Ashutosh, Umesh Nerlige Ramappa; +Cc: intel-gfx

On Thu, 16 Feb 2023, "Dixit, Ashutosh" <ashutosh.dixit@intel.com> wrote:
> On Thu, 16 Feb 2023 16:58:46 -0800, Umesh Nerlige Ramappa wrote:
>>
>> i915_perf_init can fail due to OOM. Fail driver init if i915_perf_init
>> fails.
>>
>> v2: (Jani)
>> - Reorder patch in the series
>
> Jani seemed OK with this: a drm_err would get lost in the dmesg deluge on
> OOM, so it's better to fail the probe, as long as OOM is the only failure.
>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_driver.c | 4 +++-
>>  drivers/gpu/drm/i915/i915_perf.c   | 8 ++++++--
>>  drivers/gpu/drm/i915/i915_perf.h   | 2 +-
>>  3 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
>> index 0c0ae3eabb4b..998ca41c9713 100644
>> --- a/drivers/gpu/drm/i915/i915_driver.c
>> +++ b/drivers/gpu/drm/i915/i915_driver.c
>> @@ -477,7 +477,9 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
>>	if (ret)
>>		return ret;
>>
>> -	i915_perf_init(dev_priv);
>> +	ret = i915_perf_init(dev_priv);
>
> Maybe add a comment here like this to allay people's concerns?
>
> 	/* The only non-zero ret here is -ENOMEM */
>
> or even:
>
> 	drm_WARN_ON(&dev_priv->drm, ret && ret != -ENOMEM);

Frankly I would not clutter the high level hw probe function with the
above.

BR,
Jani.


>
> Otherwise this is:
>
> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
>
>> +	if (ret)
>> +		return ret;
>>
>>	ret = i915_ggtt_probe_hw(dev_priv);
>>	if (ret)
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index 37c4cc44d68c..3306653c0b85 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -4941,7 +4941,7 @@ static void i915_perf_init_info(struct drm_i915_private *i915)
>>   * Note: i915-perf initialization is split into an 'init' and 'register'
>>   * phase with the i915_perf_register() exposing state to userspace.
>>   */
>> -void i915_perf_init(struct drm_i915_private *i915)
>> +int i915_perf_init(struct drm_i915_private *i915)
>>  {
>>	struct i915_perf *perf = &i915->perf;
>>
>> @@ -5057,12 +5057,16 @@ void i915_perf_init(struct drm_i915_private *i915)
>>		perf->i915 = i915;
>>
>>		ret = oa_init_engine_groups(perf);
>> -		if (ret)
>> +		if (ret) {
>>			drm_err(&i915->drm,
>>				"OA initialization failed %d\n", ret);
>> +			return ret;
>> +		}
>>
>>		oa_init_supported_formats(perf);
>>	}
>> +
>> +	return 0;
>>  }
>>
>>  static int destroy_config(int id, void *p, void *data)
>> diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h
>> index f96e09a4af04..253637651d5e 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.h
>> +++ b/drivers/gpu/drm/i915/i915_perf.h
>> @@ -18,7 +18,7 @@ struct i915_oa_config;
>>  struct intel_context;
>>  struct intel_engine_cs;
>>
>> -void i915_perf_init(struct drm_i915_private *i915);
>> +int i915_perf_init(struct drm_i915_private *i915);
>>  void i915_perf_fini(struct drm_i915_private *i915);
>>  void i915_perf_register(struct drm_i915_private *i915);
>>  void i915_perf_unregister(struct drm_i915_private *i915);
>> --
>> 2.36.1
>>

-- 
Jani Nikula, Intel Open Source Graphics Center


* [Intel-gfx] ✓ Fi.CI.IGT: success for Add OAM support for MTL (rev2)
  2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
                   ` (10 preceding siblings ...)
  2023-02-17  1:55 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2023-02-17 16:09 ` Patchwork
  11 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2023-02-17 16:09 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx


== Series Details ==

Series: Add OAM support for MTL (rev2)
URL   : https://patchwork.freedesktop.org/series/114033/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12754_full -> Patchwork_114033v2_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/index.html

Participating hosts (11 -> 10)
------------------------------

  Missing    (1): shard-rkl0 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_114033v2_full:

### IGT changes ###

#### Possible regressions ####

  * {igt@perf@enable-disable@0-rcs0} (NEW):
    - {shard-rkl}:        NOTRUN -> [INCOMPLETE][1] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-1/igt@perf@enable-disable@0-rcs0.html
    - {shard-dg1}:        NOTRUN -> [INCOMPLETE][2] +1 similar issue
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-dg1-18/igt@perf@enable-disable@0-rcs0.html
    - shard-apl:          NOTRUN -> [INCOMPLETE][3] +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-apl7/igt@perf@enable-disable@0-rcs0.html
    - {shard-tglu}:       NOTRUN -> [INCOMPLETE][4] +1 similar issue
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-5/igt@perf@enable-disable@0-rcs0.html
    - shard-glk:          NOTRUN -> [INCOMPLETE][5] +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk4/igt@perf@enable-disable@0-rcs0.html

  * {igt@perf@gen12-gt-exclusive-stream-ctx-handle} (NEW):
    - shard-apl:          NOTRUN -> [FAIL][6] +1 similar issue
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-apl6/igt@perf@gen12-gt-exclusive-stream-ctx-handle.html

  * {igt@perf@gen12-gt-exclusive-stream-sample-oa} (NEW):
    - shard-glk:          NOTRUN -> [FAIL][7] +1 similar issue
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk3/igt@perf@gen12-gt-exclusive-stream-sample-oa.html

  * {igt@perf@global-sseu-config-invalid@0-rcs0} (NEW):
    - {shard-tglu}:       NOTRUN -> [SKIP][8]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-3/igt@perf@global-sseu-config-invalid@0-rcs0.html

  * {igt@perf@global-sseu-config@0-rcs0} (NEW):
    - {shard-rkl}:        NOTRUN -> [SKIP][9] +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-1/igt@perf@global-sseu-config@0-rcs0.html
    - shard-tglu-10:      NOTRUN -> [SKIP][10]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@perf@global-sseu-config@0-rcs0.html

  * {igt@perf@stress-open-close@0-rcs0} (NEW):
    - shard-glk:          NOTRUN -> [ABORT][11]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk6/igt@perf@stress-open-close@0-rcs0.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_ccs@pipe-a-missing-ccs-buffer-y_tiled_gen12_rc_ccs:
    - {shard-dg1}:        NOTRUN -> [DMESG-WARN][12]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-dg1-13/igt@kms_ccs@pipe-a-missing-ccs-buffer-y_tiled_gen12_rc_ccs.html

  * {igt@kms_plane_scaling@invalid-parameters}:
    - {shard-rkl}:        [SKIP][13] ([i915#8152]) -> [SKIP][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-5/igt@kms_plane_scaling@invalid-parameters.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-1/igt@kms_plane_scaling@invalid-parameters.html

  
New tests
---------

  New tests have been introduced between CI_DRM_12754_full and Patchwork_114033v2_full:

### New IGT tests (17) ###

  * igt@perf@blocking@0-rcs0:
    - Statuses : 5 pass(s)
    - Exec time: [0.0] s

  * igt@perf@buffer-fill@0-rcs0:
    - Statuses : 5 incomplete(s)
    - Exec time: [0.0] s

  * igt@perf@enable-disable@0-rcs0:
    - Statuses : 5 incomplete(s)
    - Exec time: [0.0] s

  * igt@perf@gen12-gt-concurrent-oa-buffer-read:
    - Statuses : 4 pass(s) 1 skip(s)
    - Exec time: [0.0] s

  * igt@perf@gen12-gt-exclusive-stream-ctx-handle:
    - Statuses : 2 fail(s) 3 pass(s) 1 skip(s)
    - Exec time: [0.0] s

  * igt@perf@gen12-gt-exclusive-stream-sample-oa:
    - Statuses : 2 fail(s) 3 pass(s) 1 skip(s)
    - Exec time: [0.0] s

  * igt@perf@gen12-invalid-class-instance:
    - Statuses : 3 pass(s) 1 skip(s)
    - Exec time: [0.0] s

  * igt@perf@gen12-mi-rpc@rcs0:
    - Statuses : 3 pass(s)
    - Exec time: [0.0] s

  * igt@perf@gen12-oa-tlb-invalidate@0-rcs0:
    - Statuses : 2 pass(s)
    - Exec time: [0.0] s

  * igt@perf@gen12-unprivileged-single-ctx-counters@rcs0:
    - Statuses : 2 pass(s)
    - Exec time: [0.0] s

  * igt@perf@global-sseu-config-invalid@0-rcs0:
    - Statuses : 4 skip(s)
    - Exec time: [0.0] s

  * igt@perf@global-sseu-config@0-rcs0:
    - Statuses : 4 skip(s)
    - Exec time: [0.0] s

  * igt@perf@non-zero-reason@0-rcs0:
    - Statuses : 5 pass(s)
    - Exec time: [0.0] s

  * igt@perf@oa-exponents@0-rcs0:
    - Statuses : 4 pass(s)
    - Exec time: [0.0] s

  * igt@perf@oa-formats@0-rcs0:
    - Statuses : 4 pass(s)
    - Exec time: [0.0] s

  * igt@perf@polling@0-rcs0:
    - Statuses : 5 pass(s)
    - Exec time: [0.0] s

  * igt@perf@stress-open-close@0-rcs0:
    - Statuses : 1 abort(s) 4 pass(s)
    - Exec time: [0.0] s

  

Known issues
------------

  Here are the changes found in Patchwork_114033v2_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@fbdev@pan:
    - shard-tglu-9:       [PASS][15] -> [SKIP][16] ([i915#2582])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-tglu-9/igt@fbdev@pan.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@fbdev@pan.html

  * igt@gem_close_race@multigpu-basic-threads:
    - shard-tglu-9:       NOTRUN -> [SKIP][17] ([i915#7697])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@gem_close_race@multigpu-basic-threads.html

  * igt@gem_exec_capture@capture-recoverable:
    - shard-tglu-10:      NOTRUN -> [SKIP][18] ([i915#6344])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_exec_capture@capture-recoverable.html

  * igt@gem_exec_fair@basic-none-vip@rcs0:
    - shard-glk:          NOTRUN -> [FAIL][19] ([i915#2842])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk7/igt@gem_exec_fair@basic-none-vip@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs0:
    - shard-glk:          [PASS][20] -> [FAIL][21] ([i915#2842]) +2 similar issues
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-glk1/igt@gem_exec_fair@basic-pace@vcs0.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk3/igt@gem_exec_fair@basic-pace@vcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-tglu-9:       NOTRUN -> [FAIL][22] ([i915#2842])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_params@secure-non-master:
    - shard-tglu-10:      NOTRUN -> [SKIP][23] ([fdo#112283])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_exec_params@secure-non-master.html

  * igt@gem_lmem_evict@dontneed-evict-race:
    - shard-tglu-10:      NOTRUN -> [SKIP][24] ([i915#7582])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_lmem_evict@dontneed-evict-race.html

  * igt@gem_lmem_swapping@heavy-random:
    - shard-tglu-10:      NOTRUN -> [SKIP][25] ([i915#4613])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_lmem_swapping@heavy-random.html

  * igt@gem_lmem_swapping@heavy-verify-random-ccs:
    - shard-glk:          NOTRUN -> [SKIP][26] ([fdo#109271] / [i915#4613]) +1 similar issue
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk8/igt@gem_lmem_swapping@heavy-verify-random-ccs.html

  * igt@gem_lmem_swapping@parallel-random-verify-ccs:
    - shard-tglu-9:       NOTRUN -> [SKIP][27] ([i915#4613]) +1 similar issue
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@gem_lmem_swapping@parallel-random-verify-ccs.html

  * igt@gem_pxp@create-regular-buffer:
    - shard-tglu-10:      NOTRUN -> [SKIP][28] ([i915#4270]) +2 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_pxp@create-regular-buffer.html

  * igt@gem_pxp@display-protected-crc:
    - shard-tglu-9:       NOTRUN -> [SKIP][29] ([i915#4270])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@gem_pxp@display-protected-crc.html

  * igt@gem_softpin@evict-snoop-interruptible:
    - shard-tglu-10:      NOTRUN -> [SKIP][30] ([fdo#109312])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_softpin@evict-snoop-interruptible.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-tglu-10:      NOTRUN -> [SKIP][31] ([i915#3323])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@readonly-pwrite-unsync:
    - shard-tglu-10:      NOTRUN -> [SKIP][32] ([i915#3297])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_userptr_blits@readonly-pwrite-unsync.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-tglu-10:      NOTRUN -> [FAIL][33] ([i915#3318])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gem_userptr_blits@vma-merge.html

  * igt@gen7_exec_parse@basic-allowed:
    - shard-tglu-9:       NOTRUN -> [SKIP][34] ([fdo#109289]) +1 similar issue
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@gen7_exec_parse@basic-allowed.html

  * igt@gen7_exec_parse@cmd-crossing-page:
    - shard-tglu-10:      NOTRUN -> [SKIP][35] ([fdo#109289]) +2 similar issues
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gen7_exec_parse@cmd-crossing-page.html

  * igt@gen9_exec_parse@batch-invalid-length:
    - shard-tglu-10:      NOTRUN -> [SKIP][36] ([i915#2527] / [i915#2856]) +1 similar issue
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@gen9_exec_parse@batch-invalid-length.html

  * igt@i915_module_load@load:
    - shard-tglu-10:      NOTRUN -> [SKIP][37] ([i915#6227])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@i915_module_load@load.html

  * igt@i915_pm_backlight@basic-brightness:
    - shard-tglu-10:      NOTRUN -> [SKIP][38] ([i915#7561])
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@i915_pm_backlight@basic-brightness.html

  * igt@i915_pm_dc@dc5-psr:
    - shard-tglu-9:       NOTRUN -> [SKIP][39] ([i915#658])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@i915_pm_dc@dc5-psr.html

  * igt@i915_pm_dc@dc9-dpms:
    - shard-apl:          [PASS][40] -> [SKIP][41] ([fdo#109271])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-apl1/igt@i915_pm_dc@dc9-dpms.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-apl6/igt@i915_pm_dc@dc9-dpms.html

  * igt@i915_pm_rc6_residency@rc6-idle@vcs0:
    - shard-tglu-10:      NOTRUN -> [WARN][42] ([i915#2681]) +1 similar issue
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@i915_pm_rc6_residency@rc6-idle@vcs0.html

  * igt@i915_pm_rc6_residency@rc6-idle@vecs0:
    - shard-tglu-10:      NOTRUN -> [FAIL][43] ([i915#2681] / [i915#3591])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@i915_pm_rc6_residency@rc6-idle@vecs0.html

  * igt@i915_pm_rpm@dpms-mode-unset-lpsp:
    - shard-tglu-9:       NOTRUN -> [SKIP][44] ([i915#1397])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@i915_pm_rpm@dpms-mode-unset-lpsp.html

  * igt@i915_pm_rpm@modeset-non-lpsp-stress-no-wait:
    - shard-tglu-10:      NOTRUN -> [SKIP][45] ([fdo#111644] / [i915#1397]) +1 similar issue
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@i915_pm_rpm@modeset-non-lpsp-stress-no-wait.html

  * igt@kms_big_fb@4-tiled-64bpp-rotate-180:
    - shard-tglu-10:      NOTRUN -> [SKIP][46] ([i915#5286]) +3 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_big_fb@4-tiled-64bpp-rotate-180.html

  * igt@kms_big_fb@linear-64bpp-rotate-90:
    - shard-tglu-10:      NOTRUN -> [SKIP][47] ([fdo#111614])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_big_fb@linear-64bpp-rotate-90.html

  * igt@kms_big_fb@yf-tiled-16bpp-rotate-180:
    - shard-tglu-10:      NOTRUN -> [SKIP][48] ([fdo#111615]) +1 similar issue
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_big_fb@yf-tiled-16bpp-rotate-180.html

  * igt@kms_big_fb@yf-tiled-addfb-size-offset-overflow:
    - shard-tglu-9:       NOTRUN -> [SKIP][49] ([fdo#111615] / [i915#1845] / [i915#7651]) +3 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_big_fb@yf-tiled-addfb-size-offset-overflow.html

  * igt@kms_ccs@pipe-a-crc-sprite-planes-basic-4_tiled_dg2_rc_ccs_cc:
    - shard-tglu-10:      NOTRUN -> [SKIP][50] ([i915#6095]) +1 similar issue
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_ccs@pipe-a-crc-sprite-planes-basic-4_tiled_dg2_rc_ccs_cc.html

  * igt@kms_ccs@pipe-a-random-ccs-data-4_tiled_dg2_rc_ccs:
    - shard-tglu-10:      NOTRUN -> [SKIP][51] ([i915#3689] / [i915#6095]) +2 similar issues
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_ccs@pipe-a-random-ccs-data-4_tiled_dg2_rc_ccs.html

  * igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_ccs:
    - shard-tglu-10:      NOTRUN -> [SKIP][52] ([i915#3689]) +4 similar issues
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_ccs.html

  * igt@kms_ccs@pipe-c-bad-aux-stride-yf_tiled_ccs:
    - shard-tglu-10:      NOTRUN -> [SKIP][53] ([fdo#111615] / [i915#3689]) +4 similar issues
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_ccs@pipe-c-bad-aux-stride-yf_tiled_ccs.html

  * igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_mc_ccs:
    - shard-tglu-9:       NOTRUN -> [SKIP][54] ([i915#1845] / [i915#7651]) +51 similar issues
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-c-missing-ccs-buffer-y_tiled_gen12_rc_ccs_cc:
    - shard-glk:          NOTRUN -> [SKIP][55] ([fdo#109271] / [i915#3886]) +4 similar issues
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk1/igt@kms_ccs@pipe-c-missing-ccs-buffer-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_mc_ccs:
    - shard-tglu-10:      NOTRUN -> [SKIP][56] ([i915#3689] / [i915#3886]) +2 similar issues
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_mc_ccs.html

  * igt@kms_cdclk@plane-scaling:
    - shard-tglu-9:       NOTRUN -> [SKIP][57] ([i915#3742])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_cdclk@plane-scaling.html

  * igt@kms_chamelium_color@ctm-0-25:
    - shard-tglu-10:      NOTRUN -> [SKIP][58] ([fdo#111827])
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_chamelium_color@ctm-0-25.html

  * igt@kms_chamelium_color@gamma:
    - shard-tglu-9:       NOTRUN -> [SKIP][59] ([fdo#111827])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_chamelium_color@gamma.html

  * igt@kms_chamelium_edid@hdmi-mode-timings:
    - shard-tglu-9:       NOTRUN -> [SKIP][60] ([i915#7828]) +2 similar issues
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_chamelium_edid@hdmi-mode-timings.html

  * igt@kms_chamelium_hpd@hdmi-hpd-with-enabled-mode:
    - shard-tglu-10:      NOTRUN -> [SKIP][61] ([i915#7828]) +5 similar issues
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_chamelium_hpd@hdmi-hpd-with-enabled-mode.html

  * igt@kms_color@ctm-red-to-blue:
    - shard-tglu-9:       NOTRUN -> [SKIP][62] ([i915#3546])
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_color@ctm-red-to-blue.html

  * igt@kms_content_protection@dp-mst-lic-type-1:
    - shard-tglu-10:      NOTRUN -> [SKIP][63] ([i915#3116] / [i915#3299])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_content_protection@dp-mst-lic-type-1.html

  * igt@kms_content_protection@uevent:
    - shard-tglu-10:      NOTRUN -> [SKIP][64] ([i915#6944] / [i915#7116] / [i915#7118])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_content_protection@uevent.html

  * igt@kms_cursor_crc@cursor-rapid-movement-64x64:
    - shard-tglu-9:       NOTRUN -> [SKIP][65] ([i915#1845]) +5 similar issues
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_cursor_crc@cursor-rapid-movement-64x64.html

  * igt@kms_cursor_legacy@2x-cursor-vs-flip-atomic:
    - shard-tglu-10:      NOTRUN -> [SKIP][66] ([fdo#109274])
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_cursor_legacy@2x-cursor-vs-flip-atomic.html

  * igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions:
    - shard-glk:          [PASS][67] -> [FAIL][68] ([i915#2346])
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-glk1/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk1/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions.html

  * igt@kms_display_modes@extended-mode-basic:
    - shard-tglu-9:       NOTRUN -> [SKIP][69] ([fdo#109274]) +2 similar issues
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_display_modes@extended-mode-basic.html

  * igt@kms_dsc@dsc-with-formats:
    - shard-tglu-10:      NOTRUN -> [SKIP][70] ([i915#3555]) +2 similar issues
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_dsc@dsc-with-formats.html

  * igt@kms_fbcon_fbt@fbc-suspend:
    - shard-glk:          NOTRUN -> [FAIL][71] ([i915#4767])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk7/igt@kms_fbcon_fbt@fbc-suspend.html

  * igt@kms_flip@2x-absolute-wf_vblank:
    - shard-tglu-10:      NOTRUN -> [SKIP][72] ([fdo#109274] / [i915#3637] / [i915#3966])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_flip@2x-absolute-wf_vblank.html

  * igt@kms_flip@2x-flip-vs-fences:
    - shard-tglu-9:       NOTRUN -> [SKIP][73] ([fdo#109274] / [i915#3637]) +2 similar issues
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_flip@2x-flip-vs-fences.html

  * igt@kms_flip@2x-plain-flip-ts-check-interruptible:
    - shard-tglu-10:      NOTRUN -> [SKIP][74] ([fdo#109274] / [i915#3637]) +4 similar issues
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_flip@2x-plain-flip-ts-check-interruptible.html

  * igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1:
    - shard-glk:          [PASS][75] -> [FAIL][76] ([i915#79])
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-glk7/igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk5/igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1.html

  * igt@kms_flip@flip-vs-panning:
    - shard-tglu-9:       NOTRUN -> [SKIP][77] ([i915#3637]) +1 similar issue
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_flip@flip-vs-panning.html

  * igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-32bpp-4tiledg2rcccs-downscaling@pipe-a-valid-mode:
    - shard-tglu-10:      NOTRUN -> [SKIP][78] ([i915#2587] / [i915#2672])
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-32bpp-4tiledg2rcccs-downscaling@pipe-a-valid-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-downscaling:
    - shard-tglu-9:       NOTRUN -> [SKIP][79] ([i915#3555]) +4 similar issues
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-downscaling.html

  * igt@kms_force_connector_basic@force-load-detect:
    - shard-tglu-10:      NOTRUN -> [SKIP][80] ([fdo#109285])
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_force_connector_basic@force-load-detect.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-draw-render:
    - shard-tglu-10:      NOTRUN -> [SKIP][81] ([fdo#109280]) +21 similar issues
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-draw-render.html

  * igt@kms_frontbuffer_tracking@fbcpsr-modesetfrombusy:
    - shard-tglu-9:       NOTRUN -> [SKIP][82] ([i915#1849]) +34 similar issues
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_frontbuffer_tracking@fbcpsr-modesetfrombusy.html

  * igt@kms_frontbuffer_tracking@psr-rgb101010-draw-blt:
    - shard-tglu-10:      NOTRUN -> [SKIP][83] ([fdo#110189]) +16 similar issues
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_frontbuffer_tracking@psr-rgb101010-draw-blt.html

  * igt@kms_multipipe_modeset@basic-max-pipe-crc-check:
    - shard-tglu-10:      NOTRUN -> [SKIP][84] ([i915#1839])
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_multipipe_modeset@basic-max-pipe-crc-check.html

  * igt@kms_plane@pixel-format-source-clamping@pipe-a-planes:
    - shard-tglu-9:       NOTRUN -> [SKIP][85] ([i915#1849] / [i915#3558]) +1 similar issue
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_plane@pixel-format-source-clamping@pipe-a-planes.html

  * igt@kms_plane_scaling@plane-downscale-with-rotation-factor-0-25@pipe-c-hdmi-a-1:
    - shard-tglu-10:      NOTRUN -> [SKIP][86] ([i915#5176]) +3 similar issues
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_plane_scaling@plane-downscale-with-rotation-factor-0-25@pipe-c-hdmi-a-1.html

  * igt@kms_plane_scaling@plane-upscale-with-modifiers-factor-0-25@pipe-a-vga-1:
    - shard-snb:          NOTRUN -> [SKIP][87] ([fdo#109271]) +24 similar issues
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-snb5/igt@kms_plane_scaling@plane-upscale-with-modifiers-factor-0-25@pipe-a-vga-1.html

  * igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5:
    - shard-tglu-9:       NOTRUN -> [SKIP][88] ([i915#6953] / [i915#8152]) +1 similar issue
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5.html

  * igt@kms_prime@d3hot:
    - shard-tglu-10:      NOTRUN -> [SKIP][89] ([i915#6524])
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_prime@d3hot.html

  * igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf:
    - shard-tglu-10:      NOTRUN -> [SKIP][90] ([i915#658]) +2 similar issues
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf.html

  * igt@kms_psr2_sf@plane-move-sf-dmg-area:
    - shard-glk:          NOTRUN -> [SKIP][91] ([fdo#109271] / [i915#658])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk9/igt@kms_psr2_sf@plane-move-sf-dmg-area.html

  * igt@kms_psr@basic:
    - shard-tglu-9:       NOTRUN -> [SKIP][92] ([fdo#110189]) +4 similar issues
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_psr@basic.html

  * igt@kms_psr_stress_test@invalidate-primary-flip-overlay:
    - shard-tglu-9:       NOTRUN -> [SKIP][93] ([i915#5461])
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@kms_psr_stress_test@invalidate-primary-flip-overlay.html

  * igt@kms_tv_load_detect@load-detect:
    - shard-tglu-10:      NOTRUN -> [SKIP][94] ([fdo#109309])
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@kms_tv_load_detect@load-detect.html

  * igt@kms_vblank@pipe-d-wait-busy-hang:
    - shard-glk:          NOTRUN -> [SKIP][95] ([fdo#109271]) +88 similar issues
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk4/igt@kms_vblank@pipe-d-wait-busy-hang.html

  * {igt@perf@global-sseu-config@0-rcs0} (NEW):
    - shard-apl:          NOTRUN -> [SKIP][96] ([fdo#109271]) +1 similar issue
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-apl1/igt@perf@global-sseu-config@0-rcs0.html

  * igt@v3d/v3d_create_bo@create-bo-0:
    - shard-tglu-10:      NOTRUN -> [SKIP][97] ([fdo#109315] / [i915#2575])
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@v3d/v3d_create_bo@create-bo-0.html

  * igt@v3d/v3d_perfmon@get-values-valid-perfmon:
    - shard-tglu-9:       NOTRUN -> [SKIP][98] ([fdo#109315] / [i915#2575]) +1 similar issue
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@v3d/v3d_perfmon@get-values-valid-perfmon.html

  * igt@vc4/vc4_mmap@mmap-bad-handle:
    - shard-tglu-9:       NOTRUN -> [SKIP][99] ([i915#2575]) +2 similar issues
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-9/igt@vc4/vc4_mmap@mmap-bad-handle.html

  * igt@vc4/vc4_perfmon@create-perfmon-exceed:
    - shard-tglu-10:      NOTRUN -> [SKIP][100] ([i915#2575]) +5 similar issues
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-tglu-10/igt@vc4/vc4_perfmon@create-perfmon-exceed.html

#### Possible fixes ####

  * igt@drm_fdinfo@virtual-idle:
    - {shard-rkl}:        [FAIL][101] ([i915#7742]) -> [PASS][102]
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-2/igt@drm_fdinfo@virtual-idle.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-6/igt@drm_fdinfo@virtual-idle.html

  * igt@gem_eio@in-flight-suspend:
    - {shard-rkl}:        [FAIL][103] ([i915#5115] / [i915#7052]) -> [PASS][104]
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-3/igt@gem_eio@in-flight-suspend.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-6/igt@gem_eio@in-flight-suspend.html

  * igt@gem_exec_fair@basic-deadline:
    - {shard-rkl}:        [FAIL][105] ([i915#2846]) -> [PASS][106]
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-2/igt@gem_exec_fair@basic-deadline.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-5/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
    - shard-apl:          [FAIL][107] ([i915#2842]) -> [PASS][108]
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-apl1/igt@gem_exec_fair@basic-none-solo@rcs0.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-apl7/igt@gem_exec_fair@basic-none-solo@rcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-glk:          [FAIL][109] ([i915#2842]) -> [PASS][110]
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-glk3/igt@gem_exec_fair@basic-throttle@rcs0.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-glk5/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_reloc@basic-write-read-noreloc:
    - {shard-rkl}:        [SKIP][111] ([i915#3281]) -> [PASS][112] +4 similar issues
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-1/igt@gem_exec_reloc@basic-write-read-noreloc.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-5/igt@gem_exec_reloc@basic-write-read-noreloc.html

  * igt@gem_partial_pwrite_pread@writes-after-reads-display:
    - {shard-rkl}:        [SKIP][113] ([i915#3282]) -> [PASS][114] +2 similar issues
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-1/igt@gem_partial_pwrite_pread@writes-after-reads-display.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-5/igt@gem_partial_pwrite_pread@writes-after-reads-display.html

  * igt@gen9_exec_parse@secure-batches:
    - {shard-rkl}:        [SKIP][115] ([i915#2527]) -> [PASS][116] +1 similar issue
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-2/igt@gen9_exec_parse@secure-batches.html
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-5/igt@gen9_exec_parse@secure-batches.html

  * igt@i915_pm_dc@dc5-dpms:
    - {shard-rkl}:        [FAIL][117] ([i915#7330]) -> [PASS][118]
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-5/igt@i915_pm_dc@dc5-dpms.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-4/igt@i915_pm_dc@dc5-dpms.html

  * igt@i915_pm_rpm@drm-resources-equal:
    - {shard-rkl}:        [SKIP][119] ([fdo#109308]) -> [PASS][120]
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-5/igt@i915_pm_rpm@drm-resources-equal.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-6/igt@i915_pm_rpm@drm-resources-equal.html

  * igt@i915_pm_rps@engine-order:
    - shard-apl:          [FAIL][121] ([i915#6537]) -> [PASS][122]
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-apl7/igt@i915_pm_rps@engine-order.html
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-apl4/igt@i915_pm_rps@engine-order.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - {shard-rkl}:        [FAIL][123] ([fdo#103375]) -> [PASS][124]
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-4/igt@i915_suspend@fence-restore-tiled2untiled.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-4/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
    - {shard-rkl}:        [SKIP][125] ([i915#1845] / [i915#4098]) -> [PASS][126] +20 similar issues
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-3/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-6/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html

  * igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size:
    - shard-apl:          [FAIL][127] ([i915#2346]) -> [PASS][128]
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-apl7/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size.html
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-apl4/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-gtt:
    - {shard-rkl}:        [SKIP][129] ([i915#1849] / [i915#4098]) -> [PASS][130] +14 similar issues
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-4/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-gtt.html
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-6/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-gtt.html

  * igt@kms_plane@plane-position-hole-dpms@pipe-b-planes:
    - {shard-rkl}:        [SKIP][131] ([i915#1849]) -> [PASS][132] +1 similar issue
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-4/igt@kms_plane@plane-position-hole-dpms@pipe-b-planes.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-6/igt@kms_plane@plane-position-hole-dpms@pipe-b-planes.html

  * igt@kms_psr@sprite_mmap_gtt:
    - {shard-rkl}:        [SKIP][133] ([i915#1072]) -> [PASS][134] +1 similar issue
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-1/igt@kms_psr@sprite_mmap_gtt.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-6/igt@kms_psr@sprite_mmap_gtt.html

  * igt@perf@gen8-unprivileged-single-ctx-counters:
    - {shard-rkl}:        [SKIP][135] ([i915#2436]) -> [PASS][136]
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-3/igt@perf@gen8-unprivileged-single-ctx-counters.html
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-5/igt@perf@gen8-unprivileged-single-ctx-counters.html

  * igt@perf_pmu@idle@rcs0:
    - {shard-rkl}:        [FAIL][137] ([i915#4349]) -> [PASS][138]
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-3/igt@perf_pmu@idle@rcs0.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-1/igt@perf_pmu@idle@rcs0.html

  * igt@prime_vgem@basic-write:
    - {shard-rkl}:        [SKIP][139] ([fdo#109295] / [i915#3291] / [i915#3708]) -> [PASS][140]
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12754/shard-rkl-3/igt@prime_vgem@basic-write.html
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/shard-rkl-5/igt@prime_vgem@basic-write.html

  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103375]: https://bugs.freedesktop.org/show_bug.cgi?id=103375
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109274]: https://bugs.freedesktop.org/show_bug.cgi?id=109274
  [fdo#109280]: https://bugs.freedesktop.org/show_bug.cgi?id=109280
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109289]: https://bugs.freedesktop.org/show_bug.cgi?id=109289
  [fdo#109291]: https://bugs.freedesktop.org/show_bug.cgi?id=109291
  [fdo#109295]: https://bugs.freedesktop.org/show_bug.cgi?id=109295
  [fdo#109302]: https://bugs.freedesktop.org/show_bug.cgi?id=109302
  [fdo#109303]: https://bugs.freedesktop.org/show_bug.cgi?id=109303
  [fdo#109307]: https://bugs.freedesktop.org/show_bug.cgi?id=109307
  [fdo#109308]: https://bugs.freedesktop.org/show_bug.cgi?id=109308
  [fdo#109309]: https://bugs.freedesktop.org/show_bug.cgi?id=109309
  [fdo#109312]: https://bugs.freedesktop.org/show_bug.cgi?id=109312
  [fdo#109313]: https://bugs.freedesktop.org/show_bug.cgi?id=109313
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#109506]: https://bugs.freedesktop.org/show_bug.cgi?id=109506
  [fdo#110189]: https://bugs.freedesktop.org/show_bug.cgi?id=110189
  [fdo#110723]: https://bugs.freedesktop.org/show_bug.cgi?id=110723
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111614]: https://bugs.freedesktop.org/show_bug.cgi?id=111614
  [fdo#111615]: https://bugs.freedesktop.org/show_bug.cgi?id=111615
  [fdo#111644]: https://bugs.freedesktop.org/show_bug.cgi?id=111644
  [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [fdo#112054]: https://bugs.freedesktop.org/show_bug.cgi?id=112054
  [fdo#112283]: https://bugs.freedesktop.org/show_bug.cgi?id=112283
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#132]: https://gitlab.freedesktop.org/drm/intel/issues/132
  [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397
  [i915#1769]: https://gitlab.freedesktop.org/drm/intel/issues/1769
  [i915#1825]: https://gitlab.freedesktop.org/drm/intel/issues/1825
  [i915#1839]: https://gitlab.freedesktop.org/drm/intel/issues/1839
  [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845
  [i915#1849]: https://gitlab.freedesktop.org/drm/intel/issues/1849
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2434]: https://gitlab.freedesktop.org/drm/intel/issues/2434
  [i915#2435]: https://gitlab.freedesktop.org/drm/intel/issues/2435
  [i915#2436]: https://gitlab.freedesktop.org/drm/intel/issues/2436
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2527]: https://gitlab.freedesktop.org/drm/intel/issues/2527
  [i915#2575]: https://gitlab.freedesktop.org/drm/intel/issues/2575
  [i915#2582]: https://gitlab.freedesktop.org/drm/intel/issues/2582
  [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587
  [i915#2658]: https://gitlab.freedesktop.org/drm/intel/issues/2658
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2681]: https://gitlab.freedesktop.org/drm/intel/issues/2681
  [i915#2705]: https://gitlab.freedesktop.org/drm/intel/issues/2705
  [i915#280]: https://gitlab.freedesktop.org/drm/intel/issues/280
  [i915#284]: https://gitlab.freedesktop.org/drm/intel/issues/284
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
  [i915#2856]: https://gitlab.freedesktop.org/drm/intel/issues/2856
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#3116]: https://gitlab.freedesktop.org/drm/intel/issues/3116
  [i915#315]: https://gitlab.freedesktop.org/drm/intel/issues/315
  [i915#3281]: https://gitlab.freedesktop.org/drm/intel/issues/3281
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291
  [i915#3297]: https://gitlab.freedesktop.org/drm/intel/issues/3297
  [i915#3299]: https://gitlab.freedesktop.org/drm/intel/issues/3299
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3318]: https://gitlab.freedesktop.org/drm/intel/issues/3318
  [i915#3323]: https://gitlab.freedesktop.org/drm/intel/issues/3323
  [i915#3359]: https://gitlab.freedesktop.org/drm/intel/issues/3359
  [i915#3361]: https://gitlab.freedesktop.org/drm/intel/issues/3361
  [i915#3458]: https://gitlab.freedesktop.org/drm/intel/issues/3458
  [i915#3469]: https://gitlab.freedesktop.org/drm/intel/issues/3469
  [i915#3528]: https://gitlab.freedesktop.org/drm/intel/issues/3528
  [i915#3536]: https://gitlab.freedesktop.org/drm/intel/issues/3536
  [i915#3539]: https://gitlab.freedesktop.org/drm/intel/issues/3539
  [i915#3546]: https://gitlab.freedesktop.org/drm/intel/issues/3546
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3558]: https://gitlab.freedesktop.org/drm/intel/issues/3558
  [i915#3591]: https://gitlab.freedesktop.org/drm/intel/issues/3591
  [i915#3637]: https://gitlab.freedesktop.org/drm/intel/issues/3637
  [i915#3638]: https://gitlab.freedesktop.org/drm/intel/issues/3638
  [i915#3689]: https://gitlab.freedesktop.org/drm/intel/issues/3689
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#3734]: https://gitlab.freedesktop.org/drm/intel/issues/3734
  [i915#3742]: https://gitlab.freedesktop.org/drm/intel/issues/3742
  [i915#3804]: https://gitlab.freedesktop.org/drm/intel/issues/3804
  [i915#3840]: https://gitlab.freedesktop.org/drm/intel/issues/3840
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#3936]: https://gitlab.freedesktop.org/drm/intel/issues/3936
  [i915#3955]: https://gitlab.freedesktop.org/drm/intel/issues/3955
  [i915#3966]: https://gitlab.freedesktop.org/drm/intel/issues/3966
  [i915#4036]: https://gitlab.freedesktop.org/drm/intel/issues/4036
  [i915#404]: https://gitlab.freedesktop.org/drm/intel/issues/404
  [i915#4070]: https://gitlab.freedesktop.org/drm/intel/issues/4070
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4078]: https://gitlab.freedesktop.org/drm/intel/issues/4078
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4098]: https://gitlab.freedesktop.org/drm/intel/issues/4098
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212
  [i915#4213]: https://gitlab.freedesktop.org/drm/intel/issues/4213
  [i915#4270]: https://gitlab.freedesktop.org/drm/intel/issues/4270
  [i915#4281]: https://gitlab.freedesktop.org/drm/intel/issues/4281
  [i915#4349]: https://gitlab.freedesktop.org/drm/intel/issues/4349
  [i915#4387]: https://gitlab.freedesktop.org/drm/intel/issues/4387
  [i915#4391]: https://gitlab.freedesktop.org/drm/intel/issues/4391
  [i915#4538]: https://gitlab.freedesktop.org/drm/intel/issues/4538
  [i915#4565]: https://gitlab.freedesktop.org/drm/intel/issues/4565
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4767]: https://gitlab.freedesktop.org/drm/intel/issues/4767
  [i915#4771]: https://gitlab.freedesktop.org/drm/intel/issues/4771
  [i915#4812]: https://gitlab.freedesktop.org/drm/intel/issues/4812
  [i915#4833]: https://gitlab.freedesktop.org/drm/intel/issues/4833
  [i915#4852]: https://gitlab.freedesktop.org/drm/intel/issues/4852
  [i915#4860]: https://gitlab.freedesktop.org/drm/intel/issues/4860
  [i915#4879]: https://gitlab.freedesktop.org/drm/intel/issues/4879
  [i915#4880]: https://gitlab.freedesktop.org/drm/intel/issues/4880
  [i915#4881]: https://gitlab.freedesktop.org/drm/intel/issues/4881
  [i915#5115]: https://gitlab.freedesktop.org/drm/intel/issues/5115
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235
  [i915#5286]: https://gitlab.freedesktop.org/drm/intel/issues/5286
  [i915#5288]: https://gitlab.freedesktop.org/drm/intel/issues/5288
  [i915#5289]: https://gitlab.freedesktop.org/drm/intel/issues/5289
  [i915#5325]: https://gitlab.freedesktop.org/drm/intel/issues/5325
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#5431]: https://gitlab.freedesktop.org/drm/intel/issues/5431
  [i915#5439]: https://gitlab.freedesktop.org/drm/intel/issues/5439
  [i915#5461]: https://gitlab.freedesktop.org/drm/intel/issues/5461
  [i915#5563]: https://gitlab.freedesktop.org/drm/intel/issues/5563
  [i915#5784]: https://gitlab.freedesktop.org/drm/intel/issues/5784
  [i915#6095]: https://gitlab.freedesktop.org/drm/intel/issues/6095
  [i915#6227]: https://gitlab.freedesktop.org/drm/intel/issues/6227
  [i915#6248]: https://gitlab.freedesktop.org/drm/intel/issues/6248
  [i915#6301]: https://gitlab.freedesktop.org/drm/intel/issues/6301
  [i915#6334]: https://gitlab.freedesktop.org/drm/intel/issues/6334
  [i915#6335]: https://gitlab.freedesktop.org/drm/intel/issues/6335
  [i915#6344]: https://gitlab.freedesktop.org/drm/intel/issues/6344
  [i915#6433]: https://gitlab.freedesktop.org/drm/intel/issues/6433
  [i915#6497]: https://gitlab.freedesktop.org/drm/intel/issues/6497
  [i915#6524]: https://gitlab.freedesktop.org/drm/intel/issues/6524
  [i915#6537]: https://gitlab.freedesktop.org/drm/intel/issues/6537
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#6621]: https://gitlab.freedesktop.org/drm/intel/issues/6621
  [i915#6768]: https://gitlab.freedesktop.org/drm/intel/issues/6768
  [i915#6944]: https://gitlab.freedesktop.org/drm/intel/issues/6944
  [i915#6946]: https://gitlab.freedesktop.org/drm/intel/issues/6946
  [i915#6953]: https://gitlab.freedesktop.org/drm/intel/issues/6953
  [i915#7037]: https://gitlab.freedesktop.org/drm/intel/issues/7037
  [i915#7052]: https://gitlab.freedesktop.org/drm/intel/issues/7052
  [i915#7116]: https://gitlab.freedesktop.org/drm/intel/issues/7116
  [i915#7118]: https://gitlab.freedesktop.org/drm/intel/issues/7118
  [i915#7330]: https://gitlab.freedesktop.org/drm/intel/issues/7330
  [i915#7456]: https://gitlab.freedesktop.org/drm/intel/issues/7456
  [i915#7561]: https://gitlab.freedesktop.org/drm/intel/issues/7561
  [i915#7582]: https://gitlab.freedesktop.org/drm/intel/issues/7582
  [i915#7651]: https://gitlab.freedesktop.org/drm/intel/issues/7651
  [i915#7697]: https://gitlab.freedesktop.org/drm/intel/issues/7697
  [i915#7701]: https://gitlab.freedesktop.org/drm/intel/issues/7701
  [i915#7707]: https://gitlab.freedesktop.org/drm/intel/issues/7707
  [i915#7711]: https://gitlab.freedesktop.org/drm/intel/issues/7711
  [i915#7742]: https://gitlab.freedesktop.org/drm/intel/issues/7742
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#7949]: https://gitlab.freedesktop.org/drm/intel/issues/7949
  [i915#7957]: https://gitlab.freedesktop.org/drm/intel/issues/7957
  [i915#8152]: https://gitlab.freedesktop.org/drm/intel/issues/8152


Build changes
-------------

  * IGT: IGT_7161 -> IGTPW_8498
  * Linux: CI_DRM_12754 -> Patchwork_114033v2

  CI-20190529: 20190529
  CI_DRM_12754: e9f03769fd297b17cd356ec6274e4824511e750c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_8498: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8498/index.html
  IGT_7161: 5574f110ae838031eef6db5236bad02e8c2d2dee @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_114033v2: e9f03769fd297b17cd356ec6274e4824511e750c @ git://anongit.freedesktop.org/gfx-ci/linux

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114033v2/index.html


* Re: [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports Umesh Nerlige Ramappa
@ 2023-02-17 20:58   ` Dixit, Ashutosh
  2023-02-18  0:05     ` Umesh Nerlige Ramappa
  0 siblings, 1 reply; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-17 20:58 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Thu, 16 Feb 2023 16:58:48 -0800, Umesh Nerlige Ramappa wrote:
>

Hi Umesh, a couple of nits below.

> Some of the newer OA formats are not powers of 2. For those formats,
> adjust the hw_tail accordingly when checking for new reports.
>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_perf.c | 50 ++++++++++++++++++--------------
>  1 file changed, 28 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 9715b964aa1e..d3a1892c93be 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -542,6 +542,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>	bool pollin;
>	u32 hw_tail;
>	u64 now;
> +	u32 partial_report_size;
>
>	/* We have to consider the (unlikely) possibility that read() errors
>	 * could result in an OA buffer reset which might reset the head and
> @@ -551,10 +552,16 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>
>	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
>
> -	/* The tail pointer increases in 64 byte increments,
> -	 * not in report_size steps...
> +	/* The tail pointer increases in 64 byte increments, whereas report
> +	 * sizes need not be integral multiples or 64 or powers of 2.
s/or/of/ ---------------------------------------^

Also, I think report sizes can only be multiples of 64; the ones we have
seen so far definitely are. The lower 6 bits of the tail pointer are also 0.

> +	 * Compute potentially partially landed report in the OA buffer
>	 */
> -	hw_tail &= ~(report_size - 1);
> +	partial_report_size = OA_TAKEN(hw_tail, stream->oa_buffer.tail);
> +	partial_report_size %= report_size;
> +
> +	/* Subtract partial amount off the tail */
> +	hw_tail = gtt_offset + ((hw_tail - partial_report_size) &
> +				(stream->oa_buffer.vma->size - 1));

A couple of questions here, because OA_TAKEN uses OA_BUFFER_SIZE whereas the
above expression uses stream->oa_buffer.vma->size:

1. Is 'OA_BUFFER_SIZE == stream->oa_buffer.vma->size'? We seem to be using
   the two interchangeably in the code.
2. If yes, can we add an assert about this in alloc_oa_buffer?
3. Can the above expression be changed to:

	hw_tail = gtt_offset + OA_TAKEN(hw_tail, partial_report_size);

It would be good to use the same construct if possible. Maybe we can even
change OA_TAKEN to something like:

#define OA_TAKEN(tail, head)    ((tail - head) & (stream->oa_buffer.vma->size - 1))
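
To make the tail adjustment concrete, here is a minimal standalone sketch
(not driver code). It assumes offsets relative to the start of the buffer and
a 16 MiB power-of-2 buffer; the SKETCH_ and sketch_ names are made up for
illustration. It uses the OA_TAKEN-style masking for both steps, as in (3):

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 MiB power-of-2 buffer, standing in for OA_BUFFER_SIZE. */
#define SKETCH_OA_BUFFER_SIZE (16u * 1024u * 1024u)

/* Bytes between head and tail in a power-of-2 ring, in the style of OA_TAKEN. */
#define SKETCH_OA_TAKEN(tail, head) \
	(((tail) - (head)) & (SKETCH_OA_BUFFER_SIZE - 1))

/*
 * Round hw_tail back to the last whole-report boundary, counted from the
 * previously accepted tail. All offsets are relative to the buffer start.
 */
static uint32_t sketch_aligned_tail(uint32_t hw_tail, uint32_t old_tail,
				    uint32_t report_size)
{
	uint32_t partial;

	assert(report_size != 0);

	/* Bytes of a report that has only partially landed. */
	partial = SKETCH_OA_TAKEN(hw_tail, old_tail) % report_size;

	/* Same masking as above, so the result wraps inside the ring. */
	return SKETCH_OA_TAKEN(hw_tail, partial);
}
```

With a 192 byte report, an old tail of 0 and a hardware tail of 256, the
partial amount is 64 and the adjusted tail is 192.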

>
>	now = ktime_get_mono_fast_ns();
>
> @@ -677,6 +684,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>  {
>	int report_size = stream->oa_buffer.format->size;
>	struct drm_i915_perf_record_header header;
> +	int report_size_partial;
> +	u8 *oa_buf_end;
>
>	header.type = DRM_I915_PERF_RECORD_SAMPLE;
>	header.pad = 0;
> @@ -690,8 +699,21 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>		return -EFAULT;
>	buf += sizeof(header);
>
> -	if (copy_to_user(buf, report, report_size))
> +	oa_buf_end = stream->oa_buffer.vaddr +
> +		     stream->oa_buffer.vma->size;
> +	report_size_partial = oa_buf_end - report;
> +
> +	if (report_size_partial < report_size) {
> +		if (copy_to_user(buf, report, report_size_partial))
> +			return -EFAULT;
> +		buf += report_size_partial;
> +
> +		if (copy_to_user(buf, stream->oa_buffer.vaddr,
> +				 report_size - report_size_partial))
> +			return -EFAULT;
> +	} else if (copy_to_user(buf, report, report_size)) {
>		return -EFAULT;
> +	}
>
>	(*offset) += header.size;
>
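
For reference, the split copy in append_oa_sample() above boils down to the
following pattern. This is a standalone sketch with memcpy() standing in for
copy_to_user() and made-up names, not the driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy one report out of a ring buffer, splitting the copy in two when the
 * report straddles the end of the buffer.
 */
static void sketch_copy_report(uint8_t *dst, const uint8_t *ring,
			       size_t ring_size, size_t offset,
			       size_t report_size)
{
	size_t to_end;

	assert(offset < ring_size && report_size <= ring_size);

	/* Bytes available before the end of the ring. */
	to_end = ring_size - offset;

	if (to_end < report_size) {
		/* First piece up to the end, remainder from the start. */
		memcpy(dst, ring + offset, to_end);
		memcpy(dst + to_end, ring, report_size - to_end);
	} else {
		memcpy(dst, ring + offset, report_size);
	}
}
```

The split case can only be hit when the report size does not divide the
buffer size, e.g. 192 byte reports in a power-of-2 buffer.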
> @@ -759,8 +781,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
>	 * all a power of two).
>	 */
>	if (drm_WARN_ONCE(&uncore->i915->drm,
> -			  head > OA_BUFFER_SIZE || head % report_size ||
> -			  tail > OA_BUFFER_SIZE || tail % report_size,
> +			  head > OA_BUFFER_SIZE ||
> +			  tail > OA_BUFFER_SIZE,

The comment above the if () also needs to be fixed.

Also, does it make sense to have 'head % 64 || tail % 64' checks above? As
I was saying above, head and tail will be 64-byte aligned.
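
A sketch of what the combined check could look like, assuming the 64 byte
pointer granularity mentioned above holds for all formats (the names and the
16 MiB size are illustrative, not taken from the driver):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_OA_BUFFER_SIZE (16u * 1024u * 1024u) /* assumed buffer size */
#define SKETCH_OA_PTR_ALIGN   64u                   /* assumed pointer granularity */

/* The buffer itself must be a whole number of 64 byte units. */
static_assert(SKETCH_OA_BUFFER_SIZE % SKETCH_OA_PTR_ALIGN == 0,
	      "buffer size must be a multiple of the pointer alignment");

/* Returns true when either pointer is out of bounds or misaligned. */
static bool sketch_oa_ptrs_inconsistent(uint32_t head, uint32_t tail)
{
	return head > SKETCH_OA_BUFFER_SIZE || head % SKETCH_OA_PTR_ALIGN ||
	       tail > SKETCH_OA_BUFFER_SIZE || tail % SKETCH_OA_PTR_ALIGN;
}
```

A head advancing in 192 byte steps still passes this check, since 192 is a
multiple of 64.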

>			  "Inconsistent OA buffer pointers: head = %u, tail = %u\n",
>			  head, tail))
>		return -EIO;
> @@ -774,22 +796,6 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
>		u32 ctx_id;
>		u64 reason;
>
> -		/*
> -		 * All the report sizes factor neatly into the buffer
> -		 * size so we never expect to see a report split
> -		 * between the beginning and end of the buffer.
> -		 *
> -		 * Given the initial alignment check a misalignment
> -		 * here would imply a driver bug that would result
> -		 * in an overrun.
> -		 */
> -		if (drm_WARN_ON(&uncore->i915->drm,
> -				(OA_BUFFER_SIZE - head) < report_size)) {
> -			drm_err(&uncore->i915->drm,
> -				"Spurious OA head ptr: non-integral report offset\n");
> -			break;
> -		}
> -
>		/*
>		 * The reason field includes flags identifying what
>		 * triggered this specific report (mostly timer
> --
> 2.36.1
>


* Re: [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf Umesh Nerlige Ramappa
@ 2023-02-17 23:37   ` Umesh Nerlige Ramappa
  2023-02-21 23:53   ` Dixit, Ashutosh
  1 sibling, 0 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17 23:37 UTC (permalink / raw)
  To: intel-gfx

On Thu, Feb 16, 2023 at 04:58:49PM -0800, Umesh Nerlige Ramappa wrote:
>Current implementation of perf defaults to render and configures the
>default OAG unit. Since there are more OA units on newer hardware, allow
>user to pass engine class and instance to program specific OA units.
>
>UMD specific changes for GPUvis support:
>https://patchwork.freedesktop.org/patch/522827/?series=114023
>https://patchwork.freedesktop.org/patch/522822/?series=114023
>https://patchwork.freedesktop.org/patch/522826/?series=114023
>https://patchwork.freedesktop.org/patch/522828/?series=114023
>https://patchwork.freedesktop.org/patch/522816/?series=114023
>https://patchwork.freedesktop.org/patch/522825/?series=114023

GPUvis PR is here - https://github.com/mikesart/gpuvis/pull/81

Regards,
Umesh

>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
> drivers/gpu/drm/i915/i915_perf.c | 49 +++++++++++++++++++-------------
> include/uapi/drm/i915_drm.h      | 20 +++++++++++++
> 2 files changed, 49 insertions(+), 20 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>index d3a1892c93be..f028df812067 100644
>--- a/drivers/gpu/drm/i915/i915_perf.c
>+++ b/drivers/gpu/drm/i915/i915_perf.c
>@@ -4035,40 +4035,29 @@ static int read_properties_unlocked(struct i915_perf *perf,
> 	struct drm_i915_gem_context_param_sseu user_sseu;
> 	u64 __user *uprop = uprops;
> 	bool config_sseu = false;
>+	u8 class, instance;
> 	u32 i;
> 	int ret;
>
> 	memset(props, 0, sizeof(struct perf_open_properties));
> 	props->poll_oa_period = DEFAULT_POLL_PERIOD_NS;
>
>-	if (!n_props) {
>-		drm_dbg(&perf->i915->drm,
>-			"No i915 perf properties given\n");
>-		return -EINVAL;
>-	}
>-
>-	/* At the moment we only support using i915-perf on the RCS. */
>-	props->engine = intel_engine_lookup_user(perf->i915,
>-						 I915_ENGINE_CLASS_RENDER,
>-						 0);
>-	if (!props->engine) {
>-		drm_dbg(&perf->i915->drm,
>-			"No RENDER-capable engines\n");
>-		return -EINVAL;
>-	}
>-
> 	/* Considering that ID = 0 is reserved and assuming that we don't
> 	 * (currently) expect any configurations to ever specify duplicate
> 	 * values for a particular property ID then the last _PROP_MAX value is
> 	 * one greater than the maximum number of properties we expect to get
> 	 * from userspace.
> 	 */
>-	if (n_props >= DRM_I915_PERF_PROP_MAX) {
>+	if (!n_props || n_props >= DRM_I915_PERF_PROP_MAX) {
> 		drm_dbg(&perf->i915->drm,
>-			"More i915 perf properties specified than exist\n");
>+			"Invalid no. of i915 perf properties given\n");
> 		return -EINVAL;
> 	}
>
>+	/* Defaults when class:instance is not passed */
>+	class = I915_ENGINE_CLASS_RENDER;
>+	instance = 0;
>+
> 	for (i = 0; i < n_props; i++) {
> 		u64 oa_period, oa_freq_hz;
> 		u64 id, value;
>@@ -4189,7 +4178,13 @@ static int read_properties_unlocked(struct i915_perf *perf,
> 			}
> 			props->poll_oa_period = value;
> 			break;
>-		case DRM_I915_PERF_PROP_MAX:
>+		case DRM_I915_PERF_PROP_OA_ENGINE_CLASS:
>+			class = (u8)value;
>+			break;
>+		case DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE:
>+			instance = (u8)value;
>+			break;
>+		default:
> 			MISSING_CASE(id);
> 			return -EINVAL;
> 		}
>@@ -4197,6 +4192,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
> 		uprop += 2;
> 	}
>
>+	props->engine = intel_engine_lookup_user(perf->i915, class, instance);
>+	if (!props->engine) {
>+		drm_dbg(&perf->i915->drm,
>+			"OA engine class and instance invalid %d:%d\n",
>+			class, instance);
>+		return -EINVAL;
>+	}
>+
>+	if (!engine_supports_oa(props->engine))
>+		return -EINVAL;
>+
> 	if (config_sseu) {
> 		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
> 		if (ret) {
>@@ -5208,8 +5214,11 @@ int i915_perf_ioctl_version(void)
> 	 *
> 	 * 5: Add DRM_I915_PERF_PROP_POLL_OA_PERIOD parameter that controls the
> 	 *    interval for the hrtimer used to check for OA data.
>+	 *
>+	 * 6: Add DRM_I915_PERF_PROP_OA_ENGINE_CLASS and
>+	 *    DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE
> 	 */
>-	return 5;
>+	return 6;
> }
>
> #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>index 8df261c5ab9b..b6922b52d85c 100644
>--- a/include/uapi/drm/i915_drm.h
>+++ b/include/uapi/drm/i915_drm.h
>@@ -2758,6 +2758,26 @@ enum drm_i915_perf_property_id {
> 	 */
> 	DRM_I915_PERF_PROP_POLL_OA_PERIOD,
>
>+	/**
>+	 * In platforms with multiple OA buffers, the engine class instance must
>+	 * be passed to open a stream to a OA unit corresponding to the engine.
>+	 * Multiple engines may be mapped to the same OA unit.
>+	 *
>+	 * In addition to the class:instance, if a gem context is also passed, then
>+	 * 1) the report headers of OA reports from other engines are squashed.
>+	 * 2) OAR is enabled for the class:instance
>+	 *
>+	 * This property is available in perf revision 6.
>+	 */
>+	DRM_I915_PERF_PROP_OA_ENGINE_CLASS,
>+
>+	/**
>+	 * This parameter specifies the engine instance.
>+	 *
>+	 * This property is available in perf revision 6.
>+	 */
>+	DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE,
>+
> 	DRM_I915_PERF_PROP_MAX /* non-ABI */
> };
>
>-- 
>2.36.1
>
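
As an illustration of the property handling in this patch, the sketch below
walks (id, value) pairs and falls back to render class, instance 0 when no
class:instance is passed. It is a standalone approximation: the enum values,
struct and function names are hypothetical stand-ins, not the uAPI
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uAPI property IDs; values are illustrative. */
enum sketch_prop_id {
	SKETCH_PROP_OA_ENGINE_CLASS = 1,
	SKETCH_PROP_OA_ENGINE_INSTANCE,
	SKETCH_PROP_MAX /* non-ABI */
};

#define SKETCH_ENGINE_CLASS_RENDER 0 /* assumed default class */

struct sketch_engine_sel {
	uint8_t class_id;
	uint8_t instance;
};

/*
 * props holds n_props (id, value) pairs. Returns 0 on success, -1 on an
 * invalid property count or an unknown property id.
 */
static int sketch_read_engine_props(const uint64_t *props, uint32_t n_props,
				    struct sketch_engine_sel *sel)
{
	uint32_t i;

	assert(props && sel);

	if (!n_props || n_props >= SKETCH_PROP_MAX)
		return -1;

	/* Defaults when class:instance is not passed. */
	sel->class_id = SKETCH_ENGINE_CLASS_RENDER;
	sel->instance = 0;

	for (i = 0; i < n_props; i++) {
		uint64_t id = props[2 * i];
		uint64_t value = props[2 * i + 1];

		switch (id) {
		case SKETCH_PROP_OA_ENGINE_CLASS:
			sel->class_id = (uint8_t)value; /* truncates, as in the patch */
			break;
		case SKETCH_PROP_OA_ENGINE_INSTANCE:
			sel->instance = (uint8_t)value;
			break;
		default:
			return -1;
		}
	}

	return 0;
}
```

Note that the (u8) casts silently truncate values above 255, so a range
check before the cast may be worth considering.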


* Re: [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units Umesh Nerlige Ramappa
@ 2023-02-17 23:37   ` Umesh Nerlige Ramappa
  2023-02-23 20:05   ` Dixit, Ashutosh
  1 sibling, 0 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-17 23:37 UTC (permalink / raw)
  To: intel-gfx

On Thu, Feb 16, 2023 at 04:58:50PM -0800, Umesh Nerlige Ramappa wrote:
>MTL introduces additional OA units dedicated to media use cases. Add
>support for programming these OA units by passing the media engine class
>and instance parameters.
>
>UMD specific changes for GPUvis support:
>https://patchwork.freedesktop.org/patch/522827/?series=114023
>https://patchwork.freedesktop.org/patch/522822/?series=114023
>https://patchwork.freedesktop.org/patch/522826/?series=114023
>https://patchwork.freedesktop.org/patch/522828/?series=114023
>https://patchwork.freedesktop.org/patch/522816/?series=114023
>https://patchwork.freedesktop.org/patch/522825/?series=114023

GPUvis PR is here - https://github.com/mikesart/gpuvis/pull/81

Regards,
Umesh

>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
> drivers/gpu/drm/i915/i915_drv.h          |   2 +
> drivers/gpu/drm/i915/i915_pci.c          |   1 +
> drivers/gpu/drm/i915/i915_perf.c         | 247 ++++++++++++++++++++---
> drivers/gpu/drm/i915/i915_perf_oa_regs.h |  78 +++++++
> drivers/gpu/drm/i915/i915_perf_types.h   |  40 ++++
> drivers/gpu/drm/i915/intel_device_info.h |   1 +
> include/uapi/drm/i915_drm.h              |   4 +
> 7 files changed, 347 insertions(+), 26 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>index 0393273faa09..f3cacbf41c86 100644
>--- a/drivers/gpu/drm/i915/i915_drv.h
>+++ b/drivers/gpu/drm/i915/i915_drv.h
>@@ -856,6 +856,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
> 	(INTEL_INFO(dev_priv)->has_oa_bpc_reporting)
> #define HAS_OA_SLICE_CONTRIB_LIMITS(dev_priv) \
> 	(INTEL_INFO(dev_priv)->has_oa_slice_contrib_limits)
>+#define HAS_OAM(dev_priv) \
>+	(INTEL_INFO(dev_priv)->has_oam)
>
> /*
>  * Set this flag, when platform requires 64K GTT page sizes or larger for
>diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>index a8d942b16223..621730b6551c 100644
>--- a/drivers/gpu/drm/i915/i915_pci.c
>+++ b/drivers/gpu/drm/i915/i915_pci.c
>@@ -1028,6 +1028,7 @@ static const struct intel_device_info adl_p_info = {
> 	.has_mslice_steering = 1, \
> 	.has_oa_bpc_reporting = 1, \
> 	.has_oa_slice_contrib_limits = 1, \
>+	.has_oam = 1, \
> 	.has_rc6 = 1, \
> 	.has_reset_engine = 1, \
> 	.has_rps = 1, \
>diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>index f028df812067..a57690f4c531 100644
>--- a/drivers/gpu/drm/i915/i915_perf.c
>+++ b/drivers/gpu/drm/i915/i915_perf.c
>@@ -192,6 +192,7 @@
>  */
>
> #include <linux/anon_inodes.h>
>+#include <linux/nospec.h>
> #include <linux/sizes.h>
> #include <linux/uuid.h>
>
>@@ -326,6 +327,13 @@ static const struct i915_oa_format oa_formats[I915_OA_FORMAT_MAX] = {
> 	[I915_OA_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 },
> 	[I915_OAR_FORMAT_A32u40_A4u32_B8_C8]    = { 5, 256 },
> 	[I915_OA_FORMAT_A24u40_A14u32_B8_C8]    = { 5, 256 },
>+	[I915_OAM_FORMAT_MPEC8u64_B8_C8]	= { 1, 192, TYPE_OAM, HDR_64_BIT },
>+	[I915_OAM_FORMAT_MPEC8u32_B8_C8]	= { 2, 128, TYPE_OAM, HDR_64_BIT },
>+};
>+
>+/* PERF_GROUP_OAG is unused for oa_base, drop it for mtl */
>+static const u32 mtl_oa_base[] = {
>+	[PERF_GROUP_OAM_SAMEDIA_0] = 0x393000,
> };
>
> #define SAMPLE_OA_REPORT      (1<<0)
>@@ -418,11 +426,17 @@ static void free_oa_config_bo(struct i915_oa_config_bo *oa_bo)
> 	kfree(oa_bo);
> }
>
>+static inline const
>+struct i915_perf_regs *__oa_regs(struct i915_perf_stream *stream)
>+{
>+	return &stream->oa_buffer.group->regs;
>+}
>+
> static u32 gen12_oa_hw_tail_read(struct i915_perf_stream *stream)
> {
> 	struct intel_uncore *uncore = stream->uncore;
>
>-	return intel_uncore_read(uncore, GEN12_OAG_OATAILPTR) &
>+	return intel_uncore_read(uncore, __oa_regs(stream)->oa_tail_ptr) &
> 	       GEN12_OAG_OATAILPTR_MASK;
> }
>
>@@ -886,7 +900,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
> 		i915_reg_t oaheadptr;
>
> 		oaheadptr = GRAPHICS_VER(stream->perf->i915) == 12 ?
>-			    GEN12_OAG_OAHEADPTR : GEN8_OAHEADPTR;
>+			    __oa_regs(stream)->oa_head_ptr :
>+			    GEN8_OAHEADPTR;
>
> 		spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>
>@@ -939,7 +954,8 @@ static int gen8_oa_read(struct i915_perf_stream *stream,
> 		return -EIO;
>
> 	oastatus_reg = GRAPHICS_VER(stream->perf->i915) == 12 ?
>-		       GEN12_OAG_OASTATUS : GEN8_OASTATUS;
>+		       __oa_regs(stream)->oa_status :
>+		       GEN8_OASTATUS;
>
> 	oastatus = intel_uncore_read(uncore, oastatus_reg);
>
>@@ -1643,16 +1659,46 @@ free_noa_wait(struct i915_perf_stream *stream)
> 	i915_vma_unpin_and_release(&stream->noa_wait, 0);
> }
>
>+/*
>+ * intel_engine_lookup_user ensures that most of engine specific checks are
>+ * taken care of, however, we can run into a case where the OA unit catering to
>+ * the engine passed by the user is disabled for some reason. In such cases,
>+ * ensure oa unit corresponding to an engine is functional. If there are no
>+ * engines in the group, the unit is disabled.
>+ */
>+static bool oa_unit_functional(const struct intel_engine_cs *engine)
>+{
>+	return engine->oa_group && engine->oa_group->num_engines;
>+}
>+
> static bool engine_supports_oa(const struct intel_engine_cs *engine)
> {
> 	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
>
> 	switch (platform) {
>+	case INTEL_METEORLAKE:
>+		return engine->class == RENDER_CLASS ||
>+		       ((engine->class == VIDEO_DECODE_CLASS ||
>+			 engine->class == VIDEO_ENHANCEMENT_CLASS) &&
>+			engine->gt->type == GT_MEDIA);
> 	default:
> 		return engine->class == RENDER_CLASS;
> 	}
> }
>
>+static bool engine_class_supports_oa_format(struct intel_engine_cs *engine, int type)
>+{
>+	switch (engine->class) {
>+	case RENDER_CLASS:
>+		return type == TYPE_OAG;
>+	case VIDEO_DECODE_CLASS:
>+	case VIDEO_ENHANCEMENT_CLASS:
>+		return type == TYPE_OAM;
>+	default:
>+		return false;
>+	}
>+}
>+
> static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
> {
> 	struct i915_perf *perf = stream->perf;
>@@ -1680,7 +1726,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
> 		drm_WARN_ON(&gt->i915->drm,
> 			    intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc));
>
>-	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
>+	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
> 	intel_engine_pm_put(stream->engine);
>
> 	if (stream->ctx)
>@@ -1804,8 +1850,8 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
>
> 	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>
>-	intel_uncore_write(uncore, GEN12_OAG_OASTATUS, 0);
>-	intel_uncore_write(uncore, GEN12_OAG_OAHEADPTR,
>+	intel_uncore_write(uncore, __oa_regs(stream)->oa_status, 0);
>+	intel_uncore_write(uncore, __oa_regs(stream)->oa_head_ptr,
> 			   gtt_offset & GEN12_OAG_OAHEADPTR_MASK);
> 	stream->oa_buffer.head = gtt_offset;
>
>@@ -1817,9 +1863,9 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
> 	 *  to enable proper functionality of the overflow
> 	 *  bit."
> 	 */
>-	intel_uncore_write(uncore, GEN12_OAG_OABUFFER, gtt_offset |
>+	intel_uncore_write(uncore, __oa_regs(stream)->oa_buffer, gtt_offset |
> 			   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT);
>-	intel_uncore_write(uncore, GEN12_OAG_OATAILPTR,
>+	intel_uncore_write(uncore, __oa_regs(stream)->oa_tail_ptr,
> 			   gtt_offset & GEN12_OAG_OATAILPTR_MASK);
>
> 	/* Mark that we need updated tail pointers to read from... */
>@@ -2579,7 +2625,8 @@ gen8_modify_self(struct intel_context *ce,
> 	return err;
> }
>
>-static int gen8_configure_context(struct i915_gem_context *ctx,
>+static int gen8_configure_context(struct i915_perf_stream *stream,
>+				  struct i915_gem_context *ctx,
> 				  struct flex *flex, unsigned int count)
> {
> 	struct i915_gem_engines_iter it;
>@@ -2589,7 +2636,8 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
> 	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
> 		GEM_BUG_ON(ce == ce->engine->kernel_context);
>
>-		if (!engine_supports_oa(ce->engine))
>+		if (!engine_supports_oa(ce->engine) ||
>+		    ce->engine->class != stream->engine->class)
> 			continue;
>
> 		/* Otherwise OA settings will be set upon first use */
>@@ -2720,7 +2768,7 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
>
> 		spin_unlock(&i915->gem.contexts.lock);
>
>-		err = gen8_configure_context(ctx, regs, num_regs);
>+		err = gen8_configure_context(stream, ctx, regs, num_regs);
> 		if (err) {
> 			i915_gem_context_put(ctx);
> 			return err;
>@@ -2740,7 +2788,8 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
> 	for_each_uabi_engine(engine, i915) {
> 		struct intel_context *ce = engine->kernel_context;
>
>-		if (!engine_supports_oa(ce->engine))
>+		if (!engine_supports_oa(ce->engine) ||
>+		    ce->engine->class != stream->engine->class)
> 			continue;
>
> 		regs[0].value = intel_sseu_make_rpcs(engine->gt, &ce->sseu);
>@@ -2765,6 +2814,9 @@ gen12_configure_all_contexts(struct i915_perf_stream *stream,
> 		},
> 	};
>
>+	if (stream->engine->class != RENDER_CLASS)
>+		return 0;
>+
> 	return oa_configure_all_contexts(stream,
> 					 regs, ARRAY_SIZE(regs),
> 					 active);
>@@ -2894,7 +2946,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
> 				   _MASKED_BIT_ENABLE(GEN12_DISABLE_DOP_GATING));
> 	}
>
>-	intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG,
>+	intel_uncore_write(uncore, __oa_regs(stream)->oa_debug,
> 			   /* Disable clk ratio reports, like previous Gens. */
> 			   _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS |
> 					      GEN12_OAG_OA_DEBUG_INCLUDE_CLK_RATIO) |
>@@ -2904,7 +2956,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
> 			    */
> 			   oag_report_ctx_switches(stream));
>
>-	intel_uncore_write(uncore, GEN12_OAG_OAGLBCTXCTRL, periodic ?
>+	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctx_ctrl, periodic ?
> 			   (GEN12_OAG_OAGLBCTXCTRL_COUNTER_RESUME |
> 			    GEN12_OAG_OAGLBCTXCTRL_TIMER_ENABLE |
> 			    (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT))
>@@ -3058,8 +3110,8 @@ static void gen8_oa_enable(struct i915_perf_stream *stream)
>
> static void gen12_oa_enable(struct i915_perf_stream *stream)
> {
>-	struct intel_uncore *uncore = stream->uncore;
>-	u32 report_format = stream->oa_buffer.format->format;
>+	const struct i915_perf_regs *regs;
>+	u32 val;
>
> 	/*
> 	 * If we don't want OA reports from the OA buffer, then we don't even
>@@ -3070,9 +3122,11 @@ static void gen12_oa_enable(struct i915_perf_stream *stream)
>
> 	gen12_init_oa_buffer(stream);
>
>-	intel_uncore_write(uncore, GEN12_OAG_OACONTROL,
>-			   (report_format << GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT) |
>-			   GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE);
>+	regs = __oa_regs(stream);
>+	val = (stream->oa_buffer.format->format << regs->oa_ctrl_counter_format_shift) |
>+	      GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE;
>+
>+	intel_uncore_write(stream->uncore, regs->oa_ctrl, val);
> }
>
> /**
>@@ -3124,9 +3178,9 @@ static void gen12_oa_disable(struct i915_perf_stream *stream)
> {
> 	struct intel_uncore *uncore = stream->uncore;
>
>-	intel_uncore_write(uncore, GEN12_OAG_OACONTROL, 0);
>+	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctrl, 0);
> 	if (intel_wait_for_register(uncore,
>-				    GEN12_OAG_OACONTROL,
>+				    __oa_regs(stream)->oa_ctrl,
> 				    GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE, 0,
> 				    50))
> 		drm_err(&stream->perf->i915->drm,
>@@ -3329,6 +3383,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>
> 	stream->sample_size = sizeof(struct drm_i915_perf_record_header);
>
>+	stream->oa_buffer.group = g;
> 	stream->oa_buffer.format = &perf->oa_formats[props->oa_format];
> 	if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format->size == 0))
> 		return -EINVAL;
>@@ -3379,7 +3434,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
> 	 *   references will effectively disable RC6.
> 	 */
> 	intel_engine_pm_get(stream->engine);
>-	intel_uncore_forcewake_get(stream->uncore, FORCEWAKE_ALL);
>+	intel_uncore_forcewake_get(stream->uncore, g->fw_domains);
>
> 	/*
> 	 * Wa_16011777198:dg2: GuC resets render as part of the Wa. This causes
>@@ -3440,7 +3495,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
> 		intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc);
>
> err_gucrc:
>-	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
>+	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
> 	intel_engine_pm_put(stream->engine);
>
> 	free_oa_configs(stream);
>@@ -4033,6 +4088,7 @@ static int read_properties_unlocked(struct i915_perf *perf,
> 				    struct perf_open_properties *props)
> {
> 	struct drm_i915_gem_context_param_sseu user_sseu;
>+	const struct i915_oa_format *f;
> 	u64 __user *uprop = uprops;
> 	bool config_sseu = false;
> 	u8 class, instance;
>@@ -4203,6 +4259,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
> 	if (!engine_supports_oa(props->engine))
> 		return -EINVAL;
>
>+	if (!oa_unit_functional(props->engine))
>+		return -ENODEV;
>+
>+	i = array_index_nospec(props->oa_format, I915_OA_FORMAT_MAX);
>+	f = &perf->oa_formats[i];
>+	if (!engine_class_supports_oa_format(props->engine, f->type)) {
>+		DRM_DEBUG("Invalid OA format %d for class %d\n",
>+			  f->type, props->engine->class);
>+		return -EINVAL;
>+	}
>+
> 	if (config_sseu) {
> 		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
> 		if (ret) {
>@@ -4383,6 +4450,14 @@ static const struct i915_range gen12_oa_b_counters[] = {
> 	{}
> };
>
>+static const struct i915_range mtl_oam_b_counters[] = {
>+	{ .start = 0x393000, .end = 0x39301c },	/* GEN12_OAM_STARTTRIG1[1-8] */
>+	{ .start = 0x393020, .end = 0x39303c },	/* GEN12_OAM_REPORTTRIG1[1-8] */
>+	{ .start = 0x393040, .end = 0x39307c },	/* GEN12_OAM_CEC[0-7][0-1] */
>+	{ .start = 0x393200, .end = 0x39323C },	/* MPES[0-7] */
>+	{}
>+};
>+
> static const struct i915_range xehp_oa_b_counters[] = {
> 	{ .start = 0xdc48, .end = 0xdc48 },	/* OAA_ENABLE_REG */
> 	{ .start = 0xdd00, .end = 0xdd48 },	/* OAG_LCE0_0 - OAA_LENABLE_REG */
>@@ -4429,13 +4504,16 @@ static const struct i915_range gen12_oa_mux_regs[] = {
>
> /*
>  * Ref: 14010536224:
>- * 0x20cc is repurposed on MTL, so use a separate array for MTL.
>+ * 0x20cc is repurposed on MTL, so use a separate array for MTL. Also add the
>+ * MPES/MPEC registers.
>  */
> static const struct i915_range mtl_oa_mux_regs[] = {
> 	{ .start = 0x0d00, .end = 0x0d04 },	/* RPM_CONFIG[0-1] */
> 	{ .start = 0x0d0c, .end = 0x0d2c },	/* NOA_CONFIG[0-8] */
> 	{ .start = 0x9840, .end = 0x9840 },	/* GDT_CHICKEN_BITS */
> 	{ .start = 0x9884, .end = 0x9888 },	/* NOA_WRITE */
>+	{ .start = 0x38d100, .end = 0x38d114},	/* VISACTL */
>+	{}
> };
>
> static bool gen7_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
>@@ -4473,10 +4551,26 @@ static bool gen12_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
> 	return reg_in_range_table(addr, gen12_oa_b_counters);
> }
>
>+static bool xehp_is_valid_oam_b_counter_addr(struct i915_perf *perf, u32 addr)
>+{
>+	enum intel_platform platform = INTEL_INFO(perf->i915)->platform;
>+
>+	if (!HAS_OAM(perf->i915))
>+		return false;
>+
>+	switch (platform) {
>+	case INTEL_METEORLAKE:
>+		return reg_in_range_table(addr, mtl_oam_b_counters);
>+	default:
>+		return false;
>+	}
>+}
>+
> static bool xehp_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
> {
> 	return reg_in_range_table(addr, xehp_oa_b_counters) ||
>-		reg_in_range_table(addr, gen12_oa_b_counters);
>+		reg_in_range_table(addr, gen12_oa_b_counters) ||
>+		xehp_is_valid_oam_b_counter_addr(perf, addr);
> }
>
> static bool gen12_is_valid_mux_addr(struct i915_perf *perf, u32 addr)
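
For readers following the whitelist changes, the lookup these tables feed can
be sketched as below. This assumes a sentinel-terminated walk over inclusive
ranges; the sketch_ names are illustrative and the helper is an approximation
of reg_in_range_table(), not a copy of it. The ranges are the ones from
mtl_oam_b_counters in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_range {
	uint32_t start;
	uint32_t end; /* inclusive */
};

/* Ranges from mtl_oam_b_counters; the all-zero entry terminates the table. */
static const struct sketch_range sketch_mtl_oam_b_counters[] = {
	{ 0x393000, 0x39301c },
	{ 0x393020, 0x39303c },
	{ 0x393040, 0x39307c },
	{ 0x393200, 0x39323c },
	{ 0, 0 },
};

static bool sketch_reg_in_range_table(uint32_t addr,
				      const struct sketch_range *table)
{
	assert(table);

	/* Stop at the sentinel; without it the walk would run off the array. */
	for (; table->start || table->end; table++) {
		if (addr >= table->start && addr <= table->end)
			return true;
	}

	return false;
}
```

Under that assumption, the {} terminator added to mtl_oa_mux_regs in this
patch is what bounds the walk for that table.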
>@@ -4846,11 +4940,39 @@ static u32 __num_perf_groups_per_gt(struct intel_gt *gt)
> 	enum intel_platform platform = INTEL_INFO(gt->i915)->platform;
>
> 	switch (platform) {
>+	case INTEL_METEORLAKE:
>+		return 1;
> 	default:
> 		return 1;
> 	}
> }
>
>+static u32 __oam_engine_group(struct intel_engine_cs *engine)
>+{
>+	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
>+	struct intel_gt *gt = engine->gt;
>+	u32 group = PERF_GROUP_INVALID;
>+
>+	switch (platform) {
>+	case INTEL_METEORLAKE:
>+		/*
>+		 * There's 1 SAMEDIA gt and 1 OAM per SAMEDIA gt. All media slices
>+		 * within the gt use the same OAM. All MTL SKUs list 1 SA MEDIA.
>+		 */
>+		drm_WARN_ON(&engine->i915->drm,
>+			    engine->gt->type != GT_MEDIA);
>+
>+		group = PERF_GROUP_OAM_SAMEDIA_0;
>+		break;
>+	default:
>+		break;
>+	}
>+
>+	drm_WARN_ON(&gt->i915->drm, group >= __num_perf_groups_per_gt(gt));
>+
>+	return group;
>+}
>+
> static u32 __oa_engine_group(struct intel_engine_cs *engine)
> {
> 	if (!engine_supports_oa(engine))
>@@ -4860,11 +4982,58 @@ static u32 __oa_engine_group(struct intel_engine_cs *engine)
> 	case RENDER_CLASS:
> 		return PERF_GROUP_OAG;
>
>+	case VIDEO_DECODE_CLASS:
>+	case VIDEO_ENHANCEMENT_CLASS:
>+		return __oam_engine_group(engine);
>+
> 	default:
> 		return PERF_GROUP_INVALID;
> 	}
> }
>
>+static struct i915_perf_regs __oam_regs(u32 base)
>+{
>+	return (struct i915_perf_regs) {
>+		base,
>+		GEN12_OAM_HEAD_POINTER(base),
>+		GEN12_OAM_TAIL_POINTER(base),
>+		GEN12_OAM_BUFFER(base),
>+		GEN12_OAM_CONTEXT_CONTROL(base),
>+		GEN12_OAM_CONTROL(base),
>+		GEN12_OAM_DEBUG(base),
>+		GEN12_OAM_STATUS(base),
>+		GEN12_OAM_CONTROL_COUNTER_FORMAT_SHIFT,
>+	};
>+}
>+
>+static struct i915_perf_regs __oag_regs(void)
>+{
>+	return (struct i915_perf_regs) {
>+		0,
>+		GEN12_OAG_OAHEADPTR,
>+		GEN12_OAG_OATAILPTR,
>+		GEN12_OAG_OABUFFER,
>+		GEN12_OAG_OAGLBCTXCTRL,
>+		GEN12_OAG_OACONTROL,
>+		GEN12_OAG_OA_DEBUG,
>+		GEN12_OAG_OASTATUS,
>+		GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT,
>+	};
>+}
>+
>+static void oa_init_regs(struct intel_gt *gt, u32 id)
>+{
>+	struct i915_perf_group *group = &gt->perf.group[id];
>+	struct i915_perf_regs *regs = &group->regs;
>+
>+	if (id == PERF_GROUP_OAG && gt->type != GT_MEDIA)
>+		*regs = __oag_regs();
>+	else if (IS_METEORLAKE(gt->i915))
>+		*regs = __oam_regs(mtl_oa_base[id]);
>+	else
>+		drm_WARN(&gt->i915->drm, 1, "Unsupported platform for OA\n");
>+}
>+
> static void oa_init_groups(struct intel_gt *gt)
> {
> 	int i, num_groups = gt->perf.num_perf_groups;
>@@ -4881,6 +5050,24 @@ static void oa_init_groups(struct intel_gt *gt)
> 		g->oa_unit_id = perf->oa_unit_ids++;
>
> 		g->gt = gt;
>+		oa_init_regs(gt, i);
>+		g->fw_domains = FORCEWAKE_ALL;
>+		if (i == PERF_GROUP_OAG) {
>+			g->type = TYPE_OAG;
>+
>+			/*
>+			 * Enabling all fw domains for OAG caps the max GT
>+			 * frequency to media FF max. This could be less than
>+			 * what the user sets through the sysfs and perf
>+			 * measurements could be skewed. Since some platforms
>+			 * have separate OAM units to measure media perf, do not
>+			 * enable media fw domains for OAG.
>+			 */
>+			if (HAS_OAM(gt->i915))
>+				g->fw_domains = FORCEWAKE_GT | FORCEWAKE_RENDER;
>+		} else {
>+			g->type = TYPE_OAM;
>+		}
> 	}
> }
>
>@@ -4970,9 +5157,15 @@ static void oa_init_supported_formats(struct i915_perf *perf)
> 		break;
>
> 	case INTEL_DG2:
>+		oa_format_add(perf, I915_OAR_FORMAT_A32u40_A4u32_B8_C8);
>+		oa_format_add(perf, I915_OA_FORMAT_A24u40_A14u32_B8_C8);
>+		break;
>+
> 	case INTEL_METEORLAKE:
> 		oa_format_add(perf, I915_OAR_FORMAT_A32u40_A4u32_B8_C8);
> 		oa_format_add(perf, I915_OA_FORMAT_A24u40_A14u32_B8_C8);
>+		oa_format_add(perf, I915_OAM_FORMAT_MPEC8u64_B8_C8);
>+		oa_format_add(perf, I915_OAM_FORMAT_MPEC8u32_B8_C8);
> 		break;
>
> 	default:
>@@ -5217,8 +5410,10 @@ int i915_perf_ioctl_version(void)
> 	 *
> 	 * 6: Add DRM_I915_PERF_PROP_OA_ENGINE_CLASS and
> 	 *    DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE
>+	 *
>+	 * 7: Add support for video decode and enhancement classes.
> 	 */
>-	return 6;
>+	return 7;
> }
>
> #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>diff --git a/drivers/gpu/drm/i915/i915_perf_oa_regs.h b/drivers/gpu/drm/i915/i915_perf_oa_regs.h
>index 381d94101610..ba103875e19f 100644
>--- a/drivers/gpu/drm/i915/i915_perf_oa_regs.h
>+++ b/drivers/gpu/drm/i915/i915_perf_oa_regs.h
>@@ -138,4 +138,82 @@
> #define   GEN12_SQCNT1_PMON_ENABLE		REG_BIT(30)
> #define   GEN12_SQCNT1_OABPC			REG_BIT(29)
>
>+/* Gen12 OAM unit */
>+#define GEN12_OAM_HEAD_POINTER_OFFSET   (0x1a0)
>+#define  GEN12_OAM_HEAD_POINTER_MASK    0xffffffc0
>+
>+#define GEN12_OAM_TAIL_POINTER_OFFSET   (0x1a4)
>+#define  GEN12_OAM_TAIL_POINTER_MASK    0xffffffc0
>+
>+#define GEN12_OAM_BUFFER_OFFSET         (0x1a8)
>+#define  GEN12_OAM_BUFFER_SIZE_MASK     (0x7)
>+#define  GEN12_OAM_BUFFER_SIZE_SHIFT    (3)
>+#define  GEN12_OAM_BUFFER_MEMORY_SELECT REG_BIT(0) /* 0: PPGTT, 1: GGTT */
>+
>+#define GEN12_OAM_CONTEXT_CONTROL_OFFSET              (0x1bc)
>+#define  GEN12_OAM_CONTEXT_CONTROL_TIMER_PERIOD_SHIFT 2
>+#define  GEN12_OAM_CONTEXT_CONTROL_TIMER_ENABLE       REG_BIT(1)
>+#define  GEN12_OAM_CONTEXT_CONTROL_COUNTER_RESUME     REG_BIT(0)
>+
>+#define GEN12_OAM_CONTROL_OFFSET                (0x194)
>+#define  GEN12_OAM_CONTROL_COUNTER_FORMAT_SHIFT 1
>+#define  GEN12_OAM_CONTROL_COUNTER_ENABLE       REG_BIT(0)
>+
>+#define GEN12_OAM_DEBUG_OFFSET                      (0x198)
>+#define  GEN12_OAM_DEBUG_BUFFER_SIZE_SELECT         REG_BIT(12)
>+#define  GEN12_OAM_DEBUG_INCLUDE_CLK_RATIO          REG_BIT(6)
>+#define  GEN12_OAM_DEBUG_DISABLE_CLK_RATIO_REPORTS  REG_BIT(5)
>+#define  GEN12_OAM_DEBUG_DISABLE_GO_1_0_REPORTS     REG_BIT(2)
>+#define  GEN12_OAM_DEBUG_DISABLE_CTX_SWITCH_REPORTS REG_BIT(1)
>+
>+#define GEN12_OAM_STATUS_OFFSET            (0x19c)
>+#define  GEN12_OAM_STATUS_COUNTER_OVERFLOW REG_BIT(2)
>+#define  GEN12_OAM_STATUS_BUFFER_OVERFLOW  REG_BIT(1)
>+#define  GEN12_OAM_STATUS_REPORT_LOST      REG_BIT(0)
>+
>+#define GEN12_OAM_MMIO_TRG_OFFSET	(0x1d0)
>+
>+#define GEN12_OAM_MMIO_TRG(base) \
>+	_MMIO((base) + GEN12_OAM_MMIO_TRG_OFFSET)
>+
>+#define GEN12_OAM_HEAD_POINTER(base) \
>+	_MMIO((base) + GEN12_OAM_HEAD_POINTER_OFFSET)
>+#define GEN12_OAM_TAIL_POINTER(base) \
>+	_MMIO((base) + GEN12_OAM_TAIL_POINTER_OFFSET)
>+#define GEN12_OAM_BUFFER(base) \
>+	_MMIO((base) + GEN12_OAM_BUFFER_OFFSET)
>+#define GEN12_OAM_CONTEXT_CONTROL(base) \
>+	_MMIO((base) + GEN12_OAM_CONTEXT_CONTROL_OFFSET)
>+#define GEN12_OAM_CONTROL(base) \
>+	_MMIO((base) + GEN12_OAM_CONTROL_OFFSET)
>+#define GEN12_OAM_DEBUG(base) \
>+	_MMIO((base) + GEN12_OAM_DEBUG_OFFSET)
>+#define GEN12_OAM_STATUS(base) \
>+	_MMIO((base) + GEN12_OAM_STATUS_OFFSET)
>+
>+#define GEN12_OAM_CEC0_0_OFFSET		(0x40)
>+#define GEN12_OAM_CEC7_1_OFFSET		(0x7c)
>+#define GEN12_OAM_CEC0_0(base) \
>+	_MMIO((base) + GEN12_OAM_CEC0_0_OFFSET)
>+#define GEN12_OAM_CEC7_1(base) \
>+	_MMIO((base) + GEN12_OAM_CEC7_1_OFFSET)
>+
>+#define GEN12_OAM_STARTTRIG1_OFFSET	(0x00)
>+#define GEN12_OAM_STARTTRIG8_OFFSET	(0x1c)
>+#define GEN12_OAM_STARTTRIG1(base) \
>+	_MMIO((base) + GEN12_OAM_STARTTRIG1_OFFSET)
>+#define GEN12_OAM_STARTTRIG8(base) \
>+	_MMIO((base) + GEN12_OAM_STARTTRIG8_OFFSET)
>+
>+#define GEN12_OAM_REPORTTRIG1_OFFSET	(0x20)
>+#define GEN12_OAM_REPORTTRIG8_OFFSET	(0x3c)
>+#define GEN12_OAM_REPORTTRIG1(base) \
>+	_MMIO((base) + GEN12_OAM_REPORTTRIG1_OFFSET)
>+#define GEN12_OAM_REPORTTRIG8(base) \
>+	_MMIO((base) + GEN12_OAM_REPORTTRIG8_OFFSET)
>+
>+#define GEN12_OAM_PERF_COUNTER_B0_OFFSET	(0x84)
>+#define GEN12_OAM_PERF_COUNTER_B(base, idx) \
>+	_MMIO((base) + GEN12_OAM_PERF_COUNTER_B0_OFFSET + 4 * (idx))
>+
> #endif /* __INTEL_PERF_OA_REGS__ */
>diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
>index 8ccb0b89d019..5b2c3bab60f8 100644
>--- a/drivers/gpu/drm/i915/i915_perf_types.h
>+++ b/drivers/gpu/drm/i915/i915_perf_types.h
>@@ -20,6 +20,7 @@
> #include "gt/intel_engine_types.h"
> #include "gt/intel_sseu.h"
> #include "i915_reg_defs.h"
>+#include "intel_uncore.h"
> #include "intel_wakeref.h"
>
> struct drm_i915_private;
>@@ -33,6 +34,7 @@ struct intel_engine_cs;
>
> enum {
> 	PERF_GROUP_OAG = 0,
>+	PERF_GROUP_OAM_SAMEDIA_0 = 0,
>
> 	PERF_GROUP_MAX,
> 	PERF_GROUP_INVALID = U32_MAX,
>@@ -43,9 +45,27 @@ enum report_header {
> 	HDR_64_BIT,
> };
>
>+struct i915_perf_regs {
>+	u32 base;
>+	i915_reg_t oa_head_ptr;
>+	i915_reg_t oa_tail_ptr;
>+	i915_reg_t oa_buffer;
>+	i915_reg_t oa_ctx_ctrl;
>+	i915_reg_t oa_ctrl;
>+	i915_reg_t oa_debug;
>+	i915_reg_t oa_status;
>+	u32 oa_ctrl_counter_format_shift;
>+};
>+
>+enum {
>+	TYPE_OAG,
>+	TYPE_OAM,
>+};
>+
> struct i915_oa_format {
> 	u32 format;
> 	int size;
>+	int type;
> 	enum report_header header;
> };
>
>@@ -317,6 +337,11 @@ struct i915_perf_stream {
> 		 * @tail: The last verified tail that can be read by userspace.
> 		 */
> 		u32 tail;
>+
>+		/**
>+		 * @group: The group object for this OA buffer.
>+		 */
>+		struct i915_perf_group *group;
> 	} oa_buffer;
>
> 	/**
>@@ -431,6 +456,21 @@ struct i915_perf_group {
> 	 * @engine_mask: A mask of engines using a single OA buffer.
> 	 */
> 	intel_engine_mask_t engine_mask;
>+
>+	/*
>+	 * @regs: OA buffer register group for programming the OA unit.
>+	 */
>+	struct i915_perf_regs regs;
>+
>+	/*
>+	 * @type: Type of OA buffer, OAM, OAG etc.
>+	 */
>+	int type;
>+
>+	/*
>+	 * @fw_domains: forcewake domains required for this group.
>+	 */
>+	enum forcewake_domains fw_domains;
> };
>
> struct i915_perf_gt {
>diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
>index 80bda653d61b..45e218327f44 100644
>--- a/drivers/gpu/drm/i915/intel_device_info.h
>+++ b/drivers/gpu/drm/i915/intel_device_info.h
>@@ -166,6 +166,7 @@ enum intel_ppgtt_type {
> 	func(has_mslice_steering); \
> 	func(has_oa_bpc_reporting); \
> 	func(has_oa_slice_contrib_limits); \
>+	func(has_oam); \
> 	func(has_one_eu_per_fuse_bit); \
> 	func(has_pxp); \
> 	func(has_rc6); \
>diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>index b6922b52d85c..70bfa6530dbc 100644
>--- a/include/uapi/drm/i915_drm.h
>+++ b/include/uapi/drm/i915_drm.h
>@@ -2676,6 +2676,10 @@ enum drm_i915_oa_format {
> 	I915_OAR_FORMAT_A32u40_A4u32_B8_C8,
> 	I915_OA_FORMAT_A24u40_A14u32_B8_C8,
>
>+	/* MTL OAM */
>+	I915_OAM_FORMAT_MPEC8u64_B8_C8,
>+	I915_OAM_FORMAT_MPEC8u32_B8_C8,
>+
> 	I915_OA_FORMAT_MAX	    /* non-ABI */
> };
>
>-- 
>2.36.1
>
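
For readers following the register hunk above, here is a minimal user-space sketch of how the base-relative OAM register addresses are formed. The offsets are copied from the quoted i915_perf_oa_regs.h hunk; struct oam_regs, oam_regs_for() and the base value used in any example are hypothetical stand-ins, not the driver's types. The sketch uses designated initializers, whereas the quoted __oam_regs() relies on positional order.

```c
#include <stdint.h>

/* Offsets copied from the quoted i915_perf_oa_regs.h hunk. */
#define OAM_CONTROL_OFFSET         0x194
#define OAM_DEBUG_OFFSET           0x198
#define OAM_STATUS_OFFSET          0x19c
#define OAM_HEAD_POINTER_OFFSET    0x1a0
#define OAM_TAIL_POINTER_OFFSET    0x1a4
#define OAM_BUFFER_OFFSET          0x1a8
#define OAM_CONTEXT_CONTROL_OFFSET 0x1bc

/* Hypothetical plain-integer mirror of struct i915_perf_regs. */
struct oam_regs {
	uint32_t base;
	uint32_t head_ptr;
	uint32_t tail_ptr;
	uint32_t buffer;
	uint32_t ctx_ctrl;
	uint32_t ctrl;
	uint32_t debug;
	uint32_t status;
};

/* Every OAM register address is simply the unit base plus a fixed offset. */
static struct oam_regs oam_regs_for(uint32_t base)
{
	return (struct oam_regs) {
		.base     = base,
		.head_ptr = base + OAM_HEAD_POINTER_OFFSET,
		.tail_ptr = base + OAM_TAIL_POINTER_OFFSET,
		.buffer   = base + OAM_BUFFER_OFFSET,
		.ctx_ctrl = base + OAM_CONTEXT_CONTROL_OFFSET,
		.ctrl     = base + OAM_CONTROL_OFFSET,
		.debug    = base + OAM_DEBUG_OFFSET,
		.status   = base + OAM_STATUS_OFFSET,
	};
}
```

Naming the fields keeps the table correct even if the struct members are later reordered, which is the main reason to prefer it over a positional initializer.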

* Re: [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports
  2023-02-17 20:58   ` Dixit, Ashutosh
@ 2023-02-18  0:05     ` Umesh Nerlige Ramappa
  2023-02-18  1:57       ` Dixit, Ashutosh
  0 siblings, 1 reply; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-18  0:05 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

On Fri, Feb 17, 2023 at 12:58:18PM -0800, Dixit, Ashutosh wrote:
>On Thu, 16 Feb 2023 16:58:48 -0800, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh, couple of nits below.
>
>> Some of the newer OA formats are not powers of 2. For those formats,
>> adjust the hw_tail accordingly when checking for new reports.
>>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_perf.c | 50 ++++++++++++++++++--------------
>>  1 file changed, 28 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index 9715b964aa1e..d3a1892c93be 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -542,6 +542,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>	bool pollin;
>>	u32 hw_tail;
>>	u64 now;
>> +	u32 partial_report_size;
>>
>>	/* We have to consider the (unlikely) possibility that read() errors
>>	 * could result in an OA buffer reset which might reset the head and
>> @@ -551,10 +552,16 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>>
>>	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
>>
>> -	/* The tail pointer increases in 64 byte increments,
>> -	 * not in report_size steps...
>> +	/* The tail pointer increases in 64 byte increments, whereas report
>> +	 * sizes need not be integral multiples or 64 or powers of 2.
>s/or/of/ ---------------------------------------^
>
>Also I think report sizes can only be multiples of 64, the ones we have
>seen till now definitely are. Also the lower 6 bits of tail pointer are 0.

Agree, the only addition to the old comment should be that the new 
reports may not be powers of 2.
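
To illustrate why the old mask stops working, here is a small user-space sketch. It assumes, for simplicity, that the last verified tail is 0 so the tail offset can be reduced directly; the function names and the 192-byte size are only illustrative.

```c
#include <stdint.h>

/* Old approach: only correct when report_size is a power of 2. */
static uint32_t align_tail_mask(uint32_t tail, uint32_t report_size)
{
	return tail & ~(report_size - 1);
}

/* Works for any report_size: drop the bytes of the last partial report. */
static uint32_t align_tail_mod(uint32_t tail, uint32_t report_size)
{
	return tail - (tail % report_size);
}
```

With a 256-byte report both helpers agree. With a 192-byte report and a tail of 448, the mask yields 320, which is not a report boundary, while the modulo version yields 384 (two whole reports).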

>
>> +	 * Compute potentially partially landed report in the OA buffer
>>	 */
>> -	hw_tail &= ~(report_size - 1);
>> +	partial_report_size = OA_TAKEN(hw_tail, stream->oa_buffer.tail);
>> +	partial_report_size %= report_size;
>> +
>> +	/* Subtract partial amount off the tail */
>> +	hw_tail = gtt_offset + ((hw_tail - partial_report_size) &
>> +				(stream->oa_buffer.vma->size - 1));
>
>Couple of questions here because OA_TAKEN uses OA_BUFFER_SIZE and the above
>expression uses stream->oa_buffer.vma->size:
>
>1. Is 'OA_BUFFER_SIZE == stream->oa_buffer.vma->size'? We seem to be using
>   the two interchangeably in the code.

Yes. I think the code was updated to use vma->size when support for 
selecting OA buffer size along with large OA buffers was added, but we 
haven't pushed that upstream yet. Since I have cherry-picked patches 
here, there is some inconsistency. I would just change this patch to use 
OA_BUFFER_SIZE for now.

>2. If yes, can we add an assert about this in alloc_oa_buffer?

If I change to OA_BUFFER_SIZE, then okay to skip assert? Do you suspect 
that the vma size may actually differ from what we requested?

>3. Can the above expression be changed to:
>
>	hw_tail = gtt_offset + OA_TAKEN(hw_tail, partial_report_size);

Not if hw_tail has rolled over, but stream->oa_buffer.tail hasn't.

>
>It would be good to use the same construct if possible. Maybe we can even
>change OA_TAKEN to something like:
>
>#define OA_TAKEN(tail, head)    ((tail - head) & (stream->oa_buffer.vma->size - 1))

I am thinking of changing to OA_BUFFER_SIZE at other places in this 
patch. Thoughts?

>
>>
>>	now = ktime_get_mono_fast_ns();
>>
>> @@ -677,6 +684,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>>  {
>>	int report_size = stream->oa_buffer.format->size;
>>	struct drm_i915_perf_record_header header;
>> +	int report_size_partial;
>> +	u8 *oa_buf_end;
>>
>>	header.type = DRM_I915_PERF_RECORD_SAMPLE;
>>	header.pad = 0;
>> @@ -690,8 +699,21 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>>		return -EFAULT;
>>	buf += sizeof(header);
>>
>> -	if (copy_to_user(buf, report, report_size))
>> +	oa_buf_end = stream->oa_buffer.vaddr +
>> +		     stream->oa_buffer.vma->size;
>> +	report_size_partial = oa_buf_end - report;
>> +
>> +	if (report_size_partial < report_size) {
>> +		if (copy_to_user(buf, report, report_size_partial))
>> +			return -EFAULT;
>> +		buf += report_size_partial;
>> +
>> +		if (copy_to_user(buf, stream->oa_buffer.vaddr,
>> +				 report_size - report_size_partial))
>> +			return -EFAULT;
>> +	} else if (copy_to_user(buf, report, report_size)) {
>>		return -EFAULT;
>> +	}
>>
>>	(*offset) += header.size;
>>
>> @@ -759,8 +781,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
>>	 * all a power of two).
>>	 */
>>	if (drm_WARN_ONCE(&uncore->i915->drm,
>> -			  head > OA_BUFFER_SIZE || head % report_size ||
>> -			  tail > OA_BUFFER_SIZE || tail % report_size,
>> +			  head > OA_BUFFER_SIZE ||
>> +			  tail > OA_BUFFER_SIZE,
>
>The comment above the if () also needs to be fixed.

Will fix

>
>Also, does it make sense to have 'head % 64 || tail % 64' checks above? As
>I was saying above head and tail will be 64 byte aligned.

Since head and tail are derived from HW registers and the HW only 
increments them by 64, we should be good even without the %64.

Thanks,
Umesh

* Re: [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports
  2023-02-18  0:05     ` Umesh Nerlige Ramappa
@ 2023-02-18  1:57       ` Dixit, Ashutosh
  2023-02-21 18:51         ` Dixit, Ashutosh
  0 siblings, 1 reply; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-18  1:57 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Fri, 17 Feb 2023 16:05:50 -0800, Umesh Nerlige Ramappa wrote:
> On Fri, Feb 17, 2023 at 12:58:18PM -0800, Dixit, Ashutosh wrote:
> > On Thu, 16 Feb 2023 16:58:48 -0800, Umesh Nerlige Ramappa wrote:
> >>
> >
> > Hi Umesh, couple of nits below.
> >
> >> Some of the newer OA formats are not powers of 2. For those formats,
> >> adjust the hw_tail accordingly when checking for new reports.
> >>
> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_perf.c | 50 ++++++++++++++++++--------------
> >>  1 file changed, 28 insertions(+), 22 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> >> index 9715b964aa1e..d3a1892c93be 100644
> >> --- a/drivers/gpu/drm/i915/i915_perf.c
> >> +++ b/drivers/gpu/drm/i915/i915_perf.c
> >> @@ -542,6 +542,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> >>	bool pollin;
> >>	u32 hw_tail;
> >>	u64 now;
> >> +	u32 partial_report_size;
> >>
> >>	/* We have to consider the (unlikely) possibility that read() errors
> >>	 * could result in an OA buffer reset which might reset the head and
> >> @@ -551,10 +552,16 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> >>
> >>	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
> >>
> >> -	/* The tail pointer increases in 64 byte increments,
> >> -	 * not in report_size steps...
> >> +	/* The tail pointer increases in 64 byte increments, whereas report
> >> +	 * sizes need not be integral multiples or 64 or powers of 2.
> > s/or/of/ ---------------------------------------^
> >
> > Also I think report sizes can only be multiples of 64, the ones we have
> > seen till now definitely are. Also the lower 6 bits of tail pointer are 0.
>
> Agree, the only addition to the old comment should be that the new reports
> may not be powers of 2.
>
> >
> >> +	 * Compute potentially partially landed report in the OA buffer
> >>	 */
> >> -	hw_tail &= ~(report_size - 1);
> >> +	partial_report_size = OA_TAKEN(hw_tail, stream->oa_buffer.tail);
> >> +	partial_report_size %= report_size;
> >> +
> >> +	/* Subtract partial amount off the tail */
> >> +	hw_tail = gtt_offset + ((hw_tail - partial_report_size) &
> >> +				(stream->oa_buffer.vma->size - 1));
> >
> > Couple of questions here because OA_TAKEN uses OA_BUFFER_SIZE and the above
> > expression uses stream->oa_buffer.vma->size:
> >
> > 1. Is 'OA_BUFFER_SIZE == stream->oa_buffer.vma->size'? We seem to be using
> >   the two interchangeably in the code.
>
> Yes. I think the code was updated to use vma->size when support for
> selecting OA buffer size along with large OA buffers was added, but we
> haven't pushed that upstream yet. Since I have cherry-picked patches here,
> there is some inconsistency. I would just change this patch to use
> OA_BUFFER_SIZE for now.
>
> > 2. If yes, can we add an assert about this in alloc_oa_buffer?
>
> If I change to OA_BUFFER_SIZE, then okay to skip assert?

Yes.

> Do you suspect that the vma size may actually differ from what we
> requested?

Not sure how shmem objects are allocated but my guess would be that for a
nice whole size like 16M the vma size will be the same. So ok to just
use OA_BUFFER_SIZE in a couple of places in this patch and skip the
assert. As long as vma_size >= OA_BUFFER_SIZE we are ok.

>
> > 3. Can the above expression be changed to:
> >
> >	hw_tail = gtt_offset + OA_TAKEN(hw_tail, partial_report_size);
>
> Not if hw_tail has rolled over, but stream->oa_buffer.tail hasn't.

Why not, the two expressions are exactly the same? And anyway
stream->oa_buffer.tail is already handled in the first OA_TAKEN expression.

>
> >
> > It would be good to use the same construct if possible. Maybe we can even
> > change OA_TAKEN to something like:
> >
> > #define OA_TAKEN(tail, head)    ((tail - head) & (stream->oa_buffer.vma->size - 1))
>
> I am thinking of changing to OA_BUFFER_SIZE at other places in this
> patch. Thoughts?

Sure, let's do that, there are just 2 places.

> >
> >>
> >>	now = ktime_get_mono_fast_ns();
> >>
> >> @@ -677,6 +684,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
> >>  {
> >>	int report_size = stream->oa_buffer.format->size;
> >>	struct drm_i915_perf_record_header header;
> >> +	int report_size_partial;
> >> +	u8 *oa_buf_end;
> >>
> >>	header.type = DRM_I915_PERF_RECORD_SAMPLE;
> >>	header.pad = 0;
> >> @@ -690,8 +699,21 @@ static int append_oa_sample(struct i915_perf_stream *stream,
> >>		return -EFAULT;
> >>	buf += sizeof(header);
> >>
> >> -	if (copy_to_user(buf, report, report_size))
> >> +	oa_buf_end = stream->oa_buffer.vaddr +
> >> +		     stream->oa_buffer.vma->size;
> >> +	report_size_partial = oa_buf_end - report;
> >> +
> >> +	if (report_size_partial < report_size) {
> >> +		if (copy_to_user(buf, report, report_size_partial))
> >> +			return -EFAULT;
> >> +		buf += report_size_partial;
> >> +
> >> +		if (copy_to_user(buf, stream->oa_buffer.vaddr,
> >> +				 report_size - report_size_partial))
> >> +			return -EFAULT;
> >> +	} else if (copy_to_user(buf, report, report_size)) {
> >>		return -EFAULT;
> >> +	}
> >>
> >>	(*offset) += header.size;
> >>
> >> @@ -759,8 +781,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
> >>	 * all a power of two).
> >>	 */
> >>	if (drm_WARN_ONCE(&uncore->i915->drm,
> >> -			  head > OA_BUFFER_SIZE || head % report_size ||
> >> -			  tail > OA_BUFFER_SIZE || tail % report_size,
> >> +			  head > OA_BUFFER_SIZE ||
> >> +			  tail > OA_BUFFER_SIZE,
> >
> > The comment above the if () also needs to be fixed.
>
> Will fix
>
> >
> > Also, does it make sense to have 'head % 64 || tail % 64' checks above? As
> > I was saying above head and tail will be 64 byte aligned.
>
> Since head and tail are derived from HW registers and the HW only
> increments them by 64, we should be good even without the %64.

OK.

Thanks.
--
Ashutosh

* Re: [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports
  2023-02-18  1:57       ` Dixit, Ashutosh
@ 2023-02-21 18:51         ` Dixit, Ashutosh
  2023-02-24 19:12           ` Umesh Nerlige Ramappa
  0 siblings, 1 reply; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-21 18:51 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Fri, 17 Feb 2023 17:57:02 -0800, Dixit, Ashutosh wrote:
>
> On Fri, 17 Feb 2023 16:05:50 -0800, Umesh Nerlige Ramappa wrote:
> > On Fri, Feb 17, 2023 at 12:58:18PM -0800, Dixit, Ashutosh wrote:
> > > On Thu, 16 Feb 2023 16:58:48 -0800, Umesh Nerlige Ramappa wrote:
> > >>
> > >
> > > Hi Umesh, couple of nits below.
> > >
> > >> Some of the newer OA formats are not powers of 2. For those formats,
> > >> adjust the hw_tail accordingly when checking for new reports.
> > >>
> > >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> > >> ---
> > >>  drivers/gpu/drm/i915/i915_perf.c | 50 ++++++++++++++++++--------------
> > >>  1 file changed, 28 insertions(+), 22 deletions(-)
> > >>
> > >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> > >> index 9715b964aa1e..d3a1892c93be 100644
> > >> --- a/drivers/gpu/drm/i915/i915_perf.c
> > >> +++ b/drivers/gpu/drm/i915/i915_perf.c
> > >> @@ -542,6 +542,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> > >>	bool pollin;
> > >>	u32 hw_tail;
> > >>	u64 now;
> > >> +	u32 partial_report_size;
> > >>
> > >>	/* We have to consider the (unlikely) possibility that read() errors
> > >>	 * could result in an OA buffer reset which might reset the head and
> > >> @@ -551,10 +552,16 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> > >>
> > >>	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
> > >>
> > >> -	/* The tail pointer increases in 64 byte increments,
> > >> -	 * not in report_size steps...
> > >> +	/* The tail pointer increases in 64 byte increments, whereas report
> > >> +	 * sizes need not be integral multiples or 64 or powers of 2.
> > > s/or/of/ ---------------------------------------^
> > >
> > > Also I think report sizes can only be multiples of 64, the ones we have
> > > seen till now definitely are. Also the lower 6 bits of tail pointer are 0.
> >
> > Agree, the only addition to the old comment should be that the new reports
> > may not be powers of 2.
> >
> > >
> > >> +	 * Compute potentially partially landed report in the OA buffer
> > >>	 */
> > >> -	hw_tail &= ~(report_size - 1);
> > >> +	partial_report_size = OA_TAKEN(hw_tail, stream->oa_buffer.tail);
> > >> +	partial_report_size %= report_size;
> > >> +
> > >> +	/* Subtract partial amount off the tail */
> > >> +	hw_tail = gtt_offset + ((hw_tail - partial_report_size) &
> > >> +				(stream->oa_buffer.vma->size - 1));
> > >
> > > Couple of questions here because OA_TAKEN uses OA_BUFFER_SIZE and the above
> > > expression uses stream->oa_buffer.vma->size:
> > >
> > > 1. Is 'OA_BUFFER_SIZE == stream->oa_buffer.vma->size'? We seem to be using
> > >   the two interchangeably in the code.
> >
> > Yes. I think the code was updated to use vma->size when support for
> > selecting OA buffer size along with large OA buffers was added, but we
> > haven't pushed that upstream yet. Since I have cherry-picked patches here,
> > there is some inconsistency. I would just change this patch to use
> > OA_BUFFER_SIZE for now.
> >
> > > 2. If yes, can we add an assert about this in alloc_oa_buffer?
> >
> > If I change to OA_BUFFER_SIZE, then okay to skip assert?
>
> Yes.
>
> > Do you suspect that the vma size may actually differ from what we
> > requested?
>
> Not sure how shmem objects are allocated but my guess would be that for a
> nice whole size like 16M the vma size will be the same. So ok to just
> use OA_BUFFER_SIZE in a couple of places in this patch and skip the
> assert. As long as vma_size >= OA_BUFFER_SIZE we are ok.
>
> >
> > > 3. Can the above expression be changed to:
> > >
> > >	hw_tail = gtt_offset + OA_TAKEN(hw_tail, partial_report_size);
> >
> > Not if hw_tail has rolled over, but stream->oa_buffer.tail hasn't.
>
> Why not, the two expressions are exactly the same? And anyway
> stream->oa_buffer.tail is already handled in the first OA_TAKEN expression.

Basically, for me OA_TAKEN is a "circular diff" (for a power-of-2 sized
circular buffer), so anywhere we have these "circular diff" operations we
should be able to replace them by OA_TAKEN.
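
A rough user-space sketch of that idea, assuming offsets relative to the start of the buffer (gtt_offset left out) and a 16M power-of-2 buffer. BUF_SIZE, circ_diff() and trim_partial() are illustrative names standing in for OA_BUFFER_SIZE, OA_TAKEN and the quoted hw_tail adjustment.

```c
#include <stdint.h>

#define BUF_SIZE (16u * 1024 * 1024)	/* stand-in for OA_BUFFER_SIZE */

/* Circular difference tail - head for a power-of-2 sized buffer. */
static uint32_t circ_diff(uint32_t tail, uint32_t head)
{
	return (tail - head) & (BUF_SIZE - 1);
}

/*
 * Trim a partially landed report off hw_tail. Both steps use the same
 * circular diff: first to measure the data since the last verified
 * tail, then to step hw_tail back by the partial amount.
 */
static uint32_t trim_partial(uint32_t hw_tail, uint32_t last_tail,
			     uint32_t report_size)
{
	uint32_t partial = circ_diff(hw_tail, last_tail) % report_size;

	return circ_diff(hw_tail, partial);
}
```

Because unsigned subtraction wraps and BUF_SIZE is a power of 2, circ_diff(hw_tail, partial) is the same value as (hw_tail - partial) & (BUF_SIZE - 1), including the case where hw_tail has rolled over and the last verified tail has not.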

> > >
> > > It would be good to use the same construct if possible. Maybe we can even
> > > change OA_TAKEN to something like:
> > >
> > > #define OA_TAKEN(tail, head)    ((tail - head) & (stream->oa_buffer.vma->size - 1))
> >
> > I am thinking of changing to OA_BUFFER_SIZE at other places in this
> > patch. Thoughts?
>
> Sure, let's do that, there are just 2 places.
>
> > >
> > >>
> > >>	now = ktime_get_mono_fast_ns();
> > >>
> > >> @@ -677,6 +684,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
> > >>  {
> > >>	int report_size = stream->oa_buffer.format->size;
> > >>	struct drm_i915_perf_record_header header;
> > >> +	int report_size_partial;
> > >> +	u8 *oa_buf_end;
> > >>
> > >>	header.type = DRM_I915_PERF_RECORD_SAMPLE;
> > >>	header.pad = 0;
> > >> @@ -690,8 +699,21 @@ static int append_oa_sample(struct i915_perf_stream *stream,
> > >>		return -EFAULT;
> > >>	buf += sizeof(header);
> > >>
> > >> -	if (copy_to_user(buf, report, report_size))
> > >> +	oa_buf_end = stream->oa_buffer.vaddr +
> > >> +		     stream->oa_buffer.vma->size;
> > >> +	report_size_partial = oa_buf_end - report;
> > >> +
> > >> +	if (report_size_partial < report_size) {
> > >> +		if (copy_to_user(buf, report, report_size_partial))
> > >> +			return -EFAULT;
> > >> +		buf += report_size_partial;
> > >> +
> > >> +		if (copy_to_user(buf, stream->oa_buffer.vaddr,
> > >> +				 report_size - report_size_partial))
> > >> +			return -EFAULT;
> > >> +	} else if (copy_to_user(buf, report, report_size)) {
> > >>		return -EFAULT;
> > >> +	}
> > >>
> > >>	(*offset) += header.size;
> > >>
> > >> @@ -759,8 +781,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
> > >>	 * all a power of two).
> > >>	 */
> > >>	if (drm_WARN_ONCE(&uncore->i915->drm,
> > >> -			  head > OA_BUFFER_SIZE || head % report_size ||
> > >> -			  tail > OA_BUFFER_SIZE || tail % report_size,
> > >> +			  head > OA_BUFFER_SIZE ||
> > >> +			  tail > OA_BUFFER_SIZE,
> > >
> > > The comment above the if () also needs to be fixed.
> >
> > Will fix
> >
> > >
> > > Also, does it make sense to have 'head % 64 || tail % 64' checks above? As
> > > I was saying above head and tail will be 64 byte aligned.
> >
> > Since head and tail are derived from HW registers and the HW only
> > increments them by 64, we should be good even without the %64.
>
> OK.
>
> Thanks.
> --
> Ashutosh

* Re: [Intel-gfx] [PATCH v2 6/9] drm/i915/perf: Parse 64bit report header formats correctly
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 6/9] drm/i915/perf: Parse 64bit report header formats correctly Umesh Nerlige Ramappa
@ 2023-02-21 22:14   ` Dixit, Ashutosh
  0 siblings, 0 replies; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-21 22:14 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Thu, 16 Feb 2023 16:58:47 -0800, Umesh Nerlige Ramappa wrote:
>
> Now that OA formats come in flavor of 64 bit reports, the report header
> has 64 bit report-id, timestamp, context-id and gpu-ticks fields. When
> filtering these reports, use the right width for these fields.
>
> Note that upper dword of context id is reserved, so squash lower dword
> only.
>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_perf.c       | 105 ++++++++++++++++++++-----
>  drivers/gpu/drm/i915/i915_perf_types.h |   6 ++
>  2 files changed, 92 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 3306653c0b85..9715b964aa1e 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -441,6 +441,75 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream)
>	return oastatus1 & GEN7_OASTATUS1_TAIL_MASK;
>  }
>
> +#define oa_report_header_64bit(__s) \
> +	((__s)->oa_buffer.format->header == HDR_64_BIT)
> +
> +static inline u64
> +oa_report_id(struct i915_perf_stream *stream, void *report)
> +{
> +	return oa_report_header_64bit(stream) ? *(u64 *)report : *(u32 *)report;
> +}
> +
> +static inline u64
> +oa_report_reason(struct i915_perf_stream *stream, void *report)
> +{
> +	return (oa_report_id(stream, report) >> OAREPORT_REASON_SHIFT) &
> +	       (GRAPHICS_VER(stream->perf->i915) == 12 ?
> +		OAREPORT_REASON_MASK_EXTENDED :
> +		OAREPORT_REASON_MASK);
> +}
> +
> +static inline void
> +oa_report_id_clear(struct i915_perf_stream *stream, u32 *report)
> +{
> +	if (oa_report_header_64bit(stream))
> +		*(u64 *)report = 0;
> +	else
> +		*report = 0;
> +}
> +
> +static inline bool
> +oa_report_ctx_invalid(struct i915_perf_stream *stream, void *report)
> +{
> +	return !(oa_report_id(stream, report) &
> +	       stream->perf->gen8_valid_ctx_bit) &&
> +	       GRAPHICS_VER(stream->perf->i915) <= 11;
> +}
> +
> +static inline u64
> +oa_timestamp(struct i915_perf_stream *stream, void *report)
> +{
> +	return oa_report_header_64bit(stream) ?
> +		*((u64 *)report + 1) :
> +		*((u32 *)report + 1);
> +}
> +
> +static inline void
> +oa_timestamp_clear(struct i915_perf_stream *stream, u32 *report)
> +{
> +	if (oa_report_header_64bit(stream))
> +		*(u64 *)&report[2] = 0;
> +	else
> +		report[1] = 0;
> +}
> +
> +static inline u32
> +oa_context_id(struct i915_perf_stream *stream, u32 *report)
> +{
> +	u32 ctx_id = oa_report_header_64bit(stream) ? report[4] : report[2];
> +
> +	return ctx_id & stream->specific_ctx_id_mask;
> +}
> +
> +static inline void
> +oa_context_id_squash(struct i915_perf_stream *stream, u32 *report)
> +{
> +	if (oa_report_header_64bit(stream))
> +		report[4] = INVALID_CTX_ID;
> +	else
> +		report[2] = INVALID_CTX_ID;
> +}

Let's drop 'inline' from all these functions; the convention is to let the
compiler decide whether to inline.
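
For illustration only, here is a minimal userspace sketch of two of these
accessors written as plain static functions. The 'struct oa_stream' type and
its 'hdr_64bit' flag are stand-ins I made up for the i915 stream and format
types; only the field offsets mirror the patch (report ID first, timestamp
in the next dword or qword).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the i915 stream; only the header size matters. */
struct oa_stream {
	bool hdr_64bit;
};

/* Report ID: first qword with a 64-bit header, else the first dword. */
static uint64_t oa_report_id(const struct oa_stream *s, const void *report)
{
	if (s->hdr_64bit) {
		uint64_t id;

		memcpy(&id, report, sizeof(id));
		return id;
	} else {
		uint32_t id;

		memcpy(&id, report, sizeof(id));
		return id;
	}
}

/* Timestamp: second qword with a 64-bit header, else the second dword. */
static uint64_t oa_timestamp(const struct oa_stream *s, const void *report)
{
	const uint8_t *p = report;

	if (s->hdr_64bit) {
		uint64_t ts;

		memcpy(&ts, p + sizeof(uint64_t), sizeof(ts));
		return ts;
	} else {
		uint32_t ts;

		memcpy(&ts, p + sizeof(uint32_t), sizeof(ts));
		return ts;
	}
}
```

Without 'inline' the compiler is still free to inline small static helpers
like these.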

> +
>  /**
>   * oa_buffer_check_unlocked - check for data and update tail ptr state
>   * @stream: i915 stream instance
> @@ -521,9 +590,10 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>		 * If not : (╯°□°)╯︵ ┻━┻
>		 */

Might as well fix up the comment above to say report_id and timestamp
instead of dword 0 and 1.

With the above changes, this is:

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>


* Re: [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf Umesh Nerlige Ramappa
  2023-02-17 23:37   ` Umesh Nerlige Ramappa
@ 2023-02-21 23:53   ` Dixit, Ashutosh
  2023-02-22  0:10     ` Dixit, Ashutosh
  2023-02-24 19:37     ` Umesh Nerlige Ramappa
  1 sibling, 2 replies; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-21 23:53 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Thu, 16 Feb 2023 16:58:49 -0800, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

The patch is mostly OK, but I have a few questions below:

> Current implementation of perf defaults to render and configures the
> default OAG unit. Since there are more OA units on newer hardware, allow
> user to pass engine class and instance to program specific OA units.

I think we should say more clearly here that the OA unit is identified by
one of the engines (the class/instance of that engine) associated with that
OA unit, and that the engine -> OA unit mapping is static and depends on the
particular platform. Something like that.
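
To make the static-mapping idea concrete, here is a rough userspace sketch.
The enum names, the 'struct engine' layout and the 'on_media_gt' flag are all
hypothetical; the mapping itself only loosely follows what the series does
for MTL (render -> OAG, video engines on the media GT -> OAM).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical engine classes and OA unit ids, for illustration only. */
enum engine_class {
	CLASS_RENDER,
	CLASS_COPY,
	CLASS_VIDEO_DECODE,
	CLASS_VIDEO_ENHANCE,
};

enum oa_unit {
	OA_UNIT_OAG,
	OA_UNIT_OAM,
	OA_UNIT_NONE,
};

struct engine {
	enum engine_class class;
	uint8_t instance;
	bool on_media_gt;
};

/*
 * Static, platform-defined mapping: userspace names any engine by
 * class:instance and the driver resolves it to the OA unit behind it.
 */
static enum oa_unit engine_to_oa_unit(const struct engine *e)
{
	switch (e->class) {
	case CLASS_RENDER:
		return OA_UNIT_OAG;
	case CLASS_VIDEO_DECODE:
	case CLASS_VIDEO_ENHANCE:
		return e->on_media_gt ? OA_UNIT_OAM : OA_UNIT_NONE;
	default:
		return OA_UNIT_NONE;
	}
}
```

Several engines can resolve to the same unit, which is why passing any one
of them is enough to select it.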

>
> UMD specific changes for GPUvis support:
> https://patchwork.freedesktop.org/patch/522827/?series=114023
> https://patchwork.freedesktop.org/patch/522822/?series=114023
> https://patchwork.freedesktop.org/patch/522826/?series=114023
> https://patchwork.freedesktop.org/patch/522828/?series=114023
> https://patchwork.freedesktop.org/patch/522816/?series=114023
> https://patchwork.freedesktop.org/patch/522825/?series=114023
>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_perf.c | 49 +++++++++++++++++++-------------
>  include/uapi/drm/i915_drm.h      | 20 +++++++++++++
>  2 files changed, 49 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index d3a1892c93be..f028df812067 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -4035,40 +4035,29 @@ static int read_properties_unlocked(struct i915_perf *perf,
>	struct drm_i915_gem_context_param_sseu user_sseu;
>	u64 __user *uprop = uprops;
>	bool config_sseu = false;
> +	u8 class, instance;
>	u32 i;
>	int ret;
>
>	memset(props, 0, sizeof(struct perf_open_properties));
>	props->poll_oa_period = DEFAULT_POLL_PERIOD_NS;
>
> -	if (!n_props) {
> -		drm_dbg(&perf->i915->drm,
> -			"No i915 perf properties given\n");
> -		return -EINVAL;
> -	}
> -
> -	/* At the moment we only support using i915-perf on the RCS. */
> -	props->engine = intel_engine_lookup_user(perf->i915,
> -						 I915_ENGINE_CLASS_RENDER,
> -						 0);
> -	if (!props->engine) {
> -		drm_dbg(&perf->i915->drm,
> -			"No RENDER-capable engines\n");
> -		return -EINVAL;
> -	}
> -
>	/* Considering that ID = 0 is reserved and assuming that we don't
>	 * (currently) expect any configurations to ever specify duplicate
>	 * values for a particular property ID then the last _PROP_MAX value is
>	 * one greater than the maximum number of properties we expect to get
>	 * from userspace.
>	 */
> -	if (n_props >= DRM_I915_PERF_PROP_MAX) {
> +	if (!n_props || n_props >= DRM_I915_PERF_PROP_MAX) {
>		drm_dbg(&perf->i915->drm,
> -			"More i915 perf properties specified than exist\n");
> +			"Invalid no. of i915 perf properties given\n");

s/Invalid no./Invalid number/

>		return -EINVAL;
>	}
>
> +	/* Defaults when class:instance is not passed */
> +	class = I915_ENGINE_CLASS_RENDER;
> +	instance = 0;
> +
>	for (i = 0; i < n_props; i++) {
>		u64 oa_period, oa_freq_hz;
>		u64 id, value;
> @@ -4189,7 +4178,13 @@ static int read_properties_unlocked(struct i915_perf *perf,
>			}
>			props->poll_oa_period = value;
>			break;
> -		case DRM_I915_PERF_PROP_MAX:
> +		case DRM_I915_PERF_PROP_OA_ENGINE_CLASS:
> +			class = (u8)value;
> +			break;
> +		case DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE:
> +			instance = (u8)value;
> +			break;
> +		default:
>			MISSING_CASE(id);
>			return -EINVAL;
>		}
> @@ -4197,6 +4192,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
>		uprop += 2;
>	}
>
> +	props->engine = intel_engine_lookup_user(perf->i915, class, instance);
> +	if (!props->engine) {
> +		drm_dbg(&perf->i915->drm,
> +			"OA engine class and instance invalid %d:%d\n",
> +			class, instance);
> +		return -EINVAL;
> +	}
> +
> +	if (!engine_supports_oa(props->engine))
> +		return -EINVAL;

Do we need a drm_dbg here too?
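
Roughly what I have in mind, as a standalone sketch: each failure path logs
why it rejected the request before returning -EINVAL. Here fprintf() stands
in for drm_dbg(), and 'struct engine', its 'supports_oa' flag and the second
message text are made up for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the looked-up engine. */
struct engine {
	bool supports_oa;
};

/* Returns 0 if the engine can be used for OA, -EINVAL otherwise. */
static int validate_oa_engine(const struct engine *engine,
			      int class, int instance)
{
	if (!engine) {
		fprintf(stderr,
			"OA engine class and instance invalid %d:%d\n",
			class, instance);
		return -EINVAL;
	}

	if (!engine->supports_oa) {
		/* Second failure path gets its own message too. */
		fprintf(stderr,
			"Engine not supported by OA %d:%d\n",
			class, instance);
		return -EINVAL;
	}

	return 0;
}
```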

> +
>	if (config_sseu) {
>		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
>		if (ret) {
> @@ -5208,8 +5214,11 @@ int i915_perf_ioctl_version(void)
>	 *
>	 * 5: Add DRM_I915_PERF_PROP_POLL_OA_PERIOD parameter that controls the
>	 *    interval for the hrtimer used to check for OA data.
> +	 *
> +	 * 6: Add DRM_I915_PERF_PROP_OA_ENGINE_CLASS and
> +	 *    DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE
>	 */
> -	return 5;
> +	return 6;

Do we need a separate revision for this? Maybe club it with OAM support
since that is where this is getting introduced?

>  }
>
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 8df261c5ab9b..b6922b52d85c 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -2758,6 +2758,26 @@ enum drm_i915_perf_property_id {
>	 */
>	DRM_I915_PERF_PROP_POLL_OA_PERIOD,
>
> +	/**
> +	 * In platforms with multiple OA buffers, the engine class instance must
> +	 * be passed to open a stream to a OA unit corresponding to the engine.
> +	 * Multiple engines may be mapped to the same OA unit.
> +	 *
> +	 * In addition to the class:instance, if a gem context is also passed, then
> +	 * 1) the report headers of OA reports from other engines are squashed.

Other engines, or do you mean other contexts?

> +	 * 2) OAR is enabled for the class:instance

For the render engine?

Maybe the above comments need to be crisper since they are in i915_drm.h,
or is it only me who is confused :)

> +	 *
> +	 * This property is available in perf revision 6.
> +	 */
> +	DRM_I915_PERF_PROP_OA_ENGINE_CLASS,
> +
> +	/**
> +	 * This parameter specifies the engine instance.
> +	 *
> +	 * This property is available in perf revision 6.
> +	 */
> +	DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE,
> +
>	DRM_I915_PERF_PROP_MAX /* non-ABI */
>  };
>
> --
> 2.36.1
>

Thanks.
--
Ashutosh


* Re: [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf
  2023-02-21 23:53   ` Dixit, Ashutosh
@ 2023-02-22  0:10     ` Dixit, Ashutosh
  2023-02-24 19:37     ` Umesh Nerlige Ramappa
  1 sibling, 0 replies; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-22  0:10 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Tue, 21 Feb 2023 15:53:57 -0800, Dixit, Ashutosh wrote:
>
> > +	/**
> > +	 * In platforms with multiple OA buffers, the engine class instance must
> > +	 * be passed to open a stream to a OA unit corresponding to the engine.
> > +	 * Multiple engines may be mapped to the same OA unit.
> > +	 *
> > +	 * In addition to the class:instance, if a gem context is also passed, then
> > +	 * 1) the report headers of OA reports from other engines are squashed.
>
> Other engines or you mean other contexts?
>
> > +	 * 2) OAR is enabled for the class:instance
>
> For render engine?

I think this means the engine will support MI_REPORT_PERF_COUNT, but I am not
sure how (or whether) this is done.


* Re: [Intel-gfx] [PATCH v2 4/9] drm/i915/perf: Group engines into respective OA groups
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 4/9] drm/i915/perf: Group engines into respective OA groups Umesh Nerlige Ramappa
@ 2023-02-22 21:52   ` Dixit, Ashutosh
  2023-02-24 17:30     ` Umesh Nerlige Ramappa
  0 siblings, 1 reply; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-22 21:52 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Thu, 16 Feb 2023 16:58:45 -0800, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

> Now that we may have multiple OA units in a single GT as well as on
> separate GTs, create an engine group that maps to a single OA unit.
>
> v2: (Jani)
> - Drop warning on ENOMEM
> - Reorder patch in the series
>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_types.h |   4 +
>  drivers/gpu/drm/i915/gt/intel_sseu.c         |   3 +-
>  drivers/gpu/drm/i915/i915_perf.c             | 124 +++++++++++++++++--
>  drivers/gpu/drm/i915/i915_perf_types.h       |  51 +++++++-
>  4 files changed, 169 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 4fd54fb8810f..8a8b0dce241b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -53,6 +53,8 @@ struct intel_gt;
>  struct intel_ring;
>  struct intel_uncore;
>  struct intel_breadcrumbs;
> +struct intel_engine_cs;
> +struct i915_perf_group;
>
>  typedef u32 intel_engine_mask_t;
>  #define ALL_ENGINES ((intel_engine_mask_t)~0ul)
> @@ -603,6 +605,8 @@ struct intel_engine_cs {
>	} props, defaults;
>
>	I915_SELFTEST_DECLARE(struct fault_attr reset_timeout);
> +
> +	struct i915_perf_group *oa_group;

I think 'struct i915_oa_unit' is a better name (since it suggests a HW
entity), but changing it means changing it everywhere, so leave it as is and
add a comment to the effect that:

	1 OA unit <-> 1 OA buffer <-> 1 perf group

>  };
>
>  static inline bool
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
> index 6c6198a257ac..1141f875f5bd 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> @@ -6,6 +6,7 @@
>  #include <linux/string_helpers.h>
>
>  #include "i915_drv.h"
> +#include "i915_perf_types.h"
>  #include "intel_engine_regs.h"
>  #include "intel_gt_regs.h"
>  #include "intel_sseu.h"
> @@ -677,7 +678,7 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt,
>	 * If i915/perf is active, we want a stable powergating configuration
>	 * on the system. Use the configuration pinned by i915/perf.
>	 */
> -	if (gt->perf.exclusive_stream)
> +	if (gt->perf.group && gt->perf.group[PERF_GROUP_OAG].exclusive_stream)

I haven't looked into what this function does; hopefully it is OK to do this
only for OAG?

>		req_sseu = &gt->perf.sseu;
>
>	slices = hweight8(req_sseu->slice_mask);
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 1229f65534e2..37c4cc44d68c 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -1584,8 +1584,9 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
>  {
>	struct i915_perf *perf = stream->perf;
>	struct intel_gt *gt = stream->engine->gt;
> +	struct i915_perf_group *g = stream->engine->oa_group;
>
> -	if (WARN_ON(stream != gt->perf.exclusive_stream))
> +	if (WARN_ON(stream != g->exclusive_stream))
>		return;
>
>	/*
> @@ -1594,7 +1595,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
>	 *
>	 * See i915_oa_init_reg_state() and lrc_configure_all_contexts()
>	 */
> -	WRITE_ONCE(gt->perf.exclusive_stream, NULL);
> +	WRITE_ONCE(g->exclusive_stream, NULL);
>	perf->ops.disable_metric_set(stream);
>
>	free_oa_buffer(stream);
> @@ -3192,6 +3193,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>  {
>	struct drm_i915_private *i915 = stream->perf->i915;
>	struct i915_perf *perf = stream->perf;
> +	struct i915_perf_group *g;
>	struct intel_gt *gt;
>	int ret;
>
> @@ -3202,6 +3204,12 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>	}
>	gt = props->engine->gt;
>
> +	g = props->engine->oa_group;
> +	if (!g) {
> +		DRM_DEBUG("Perf group invalid\n");
> +		return -EINVAL;
> +	}

This check should be moved to the engine_supports_oa check in
read_properties_unlocked in "drm/i915/perf: Add engine class instance
parameters to perf". It basically duplicates that check I think.

Or rather, I think the engine_supports_oa check should now be rewritten as
follows:

static bool engine_supports_oa(const struct intel_engine_cs *engine)
{
	return engine->oa_group;
}

I suggest this because there are many more engine_supports_oa call sites.

If we do this in read_properties_unlocked, we don't need the above check
here.
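
As a self-contained sketch of the idea (the struct layouts are simplified
stand-ins, not the real i915 types): once the group pointer is assigned at
init time, "supports OA" is just a NULL check, and a caller that validated
the engine up front can use the group without checking it again.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the i915 structures. */
struct perf_group {
	void *exclusive_stream;
};

struct engine {
	struct perf_group *oa_group;	/* NULL if no OA unit serves it */
};

/* All platform/class checks already happened when groups were set up. */
static bool engine_supports_oa(const struct engine *engine)
{
	return engine->oa_group != NULL;
}

/* Caller has already rejected engines without OA support. */
static bool oa_unit_busy(const struct engine *engine)
{
	return engine->oa_group->exclusive_stream != NULL;
}
```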

> +
>	/*
>	 * If the sysfs metrics/ directory wasn't registered for some
>	 * reason then don't let userspace try their luck with config
> @@ -3231,7 +3239,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>	 * counter reports and marshal to the appropriate client
>	 * we currently only allow exclusive access
>	 */
> -	if (gt->perf.exclusive_stream) {
> +	if (g->exclusive_stream) {
>		drm_dbg(&stream->perf->i915->drm,
>			"OA unit already in use\n");
>		return -EBUSY;
> @@ -3326,7 +3334,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>	stream->ops = &i915_oa_stream_ops;
>
>	stream->engine->gt->perf.sseu = props->sseu;
> -	WRITE_ONCE(gt->perf.exclusive_stream, stream);
> +	WRITE_ONCE(g->exclusive_stream, stream);
>
>	ret = i915_perf_stream_enable_sync(stream);
>	if (ret) {
> @@ -3349,7 +3357,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>	return 0;
>
>  err_enable:
> -	WRITE_ONCE(gt->perf.exclusive_stream, NULL);
> +	WRITE_ONCE(g->exclusive_stream, NULL);
>	perf->ops.disable_metric_set(stream);
>
>	free_oa_buffer(stream);
> @@ -3378,12 +3386,13 @@ void i915_oa_init_reg_state(const struct intel_context *ce,
>			    const struct intel_engine_cs *engine)
>  {
>	struct i915_perf_stream *stream;
> +	struct i915_perf_group *g = engine->oa_group;
>
> -	if (!engine_supports_oa(engine))
> +	if (!g)
>		return;
>
>	/* perf.exclusive_stream serialised by lrc_configure_all_contexts() */
> -	stream = READ_ONCE(engine->gt->perf.exclusive_stream);
> +	stream = READ_ONCE(g->exclusive_stream);
>	if (stream && GRAPHICS_VER(stream->perf->i915) < 12)
>		gen8_update_reg_state_unlocked(ce, stream);
>  }
> @@ -4753,6 +4762,95 @@ static struct ctl_table oa_table[] = {
>	{}
>  };
>
> +static u32 __num_perf_groups_per_gt(struct intel_gt *gt)
> +{
> +	enum intel_platform platform = INTEL_INFO(gt->i915)->platform;
> +
> +	switch (platform) {
> +	default:
> +		return 1;
> +	}

I think this function should just 'return 1': this series does not introduce
any value other than 1, so there is no need for the switch statement.

> +}
> +
> +static u32 __oa_engine_group(struct intel_engine_cs *engine)
> +{
> +	if (!engine_supports_oa(engine))
> +		return PERF_GROUP_INVALID;
> +
> +	switch (engine->class) {
> +	case RENDER_CLASS:
> +		return PERF_GROUP_OAG;
> +
> +	default:
> +		return PERF_GROUP_INVALID;
> +	}
> +}
> +
> +static void oa_init_groups(struct intel_gt *gt)
> +{
> +	int i, num_groups = gt->perf.num_perf_groups;
> +	struct i915_perf *perf = &gt->i915->perf;
> +
> +	for (i = 0; i < num_groups; i++) {
> +		struct i915_perf_group *g = &gt->perf.group[i];
> +
> +		/* Fused off engines can result in a group with num_engines == 0 */
> +		if (g->num_engines == 0)
> +			continue;
> +
> +		/* Set oa_unit_ids now to ensure ids remain contiguous. */
> +		g->oa_unit_id = perf->oa_unit_ids++;
> +
> +		g->gt = gt;
> +	}
> +}
> +
> +static int oa_init_gt(struct intel_gt *gt)
> +{
> +	u32 num_groups = __num_perf_groups_per_gt(gt);
> +	struct intel_engine_cs *engine;
> +	struct i915_perf_group *g;
> +	intel_engine_mask_t tmp;
> +
> +	g = kcalloc(num_groups, sizeof(*g), GFP_KERNEL);
> +	if (!g)
> +		return -ENOMEM;
> +
> +	for_each_engine_masked(engine, gt, ALL_ENGINES, tmp) {
> +		u32 index;
> +
> +		index = __oa_engine_group(engine);
> +		if (index < num_groups) {
> +			g[index].engine_mask |= BIT(engine->id);
> +			g[index].num_engines++;
> +			engine->oa_group = &g[index];
> +		} else {
> +			engine->oa_group = NULL;
> +		}

We can avoid the else by initializing engine->oa_group to NULL at the start
of the for_each_engine_masked loop.
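
Something along these lines, shown as a userspace sketch with simplified
types (the 'is_render' flag and the array-based loop are stand-ins for the
real class check and for_each_engine_masked):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GROUP_INVALID UINT32_MAX

struct perf_group {
	uint32_t engine_mask;
	uint32_t num_engines;
};

struct engine {
	unsigned int id;
	bool is_render;
	struct perf_group *oa_group;
};

/* Hypothetical equivalent of __oa_engine_group(). */
static uint32_t engine_group_index(const struct engine *e)
{
	return e->is_render ? 0 : GROUP_INVALID;
}

static void assign_groups(struct engine *engines, size_t n,
			  struct perf_group *groups, uint32_t num_groups)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct engine *e = &engines[i];
		uint32_t index = engine_group_index(e);

		/* Default to "no group"; this replaces the else branch. */
		e->oa_group = NULL;

		if (index < num_groups) {
			groups[index].engine_mask |= UINT32_C(1) << e->id;
			groups[index].num_engines++;
			e->oa_group = &groups[index];
		}
	}
}
```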

> +	}
> +
> +	gt->perf.num_perf_groups = num_groups;
> +	gt->perf.group = g;
> +
> +	oa_init_groups(gt);
> +
> +	return 0;
> +}
> +
> +static int oa_init_engine_groups(struct i915_perf *perf)
> +{
> +	struct intel_gt *gt;
> +	int i, ret;
> +
> +	for_each_gt(gt, perf->i915, i) {
> +		ret = oa_init_gt(gt);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +
>  static void oa_init_supported_formats(struct i915_perf *perf)
>  {
>	struct drm_i915_private *i915 = perf->i915;
> @@ -4919,7 +5017,7 @@ void i915_perf_init(struct drm_i915_private *i915)
>
>	if (perf->ops.enable_metric_set) {
>		struct intel_gt *gt;
> -		int i;
> +		int i, ret;
>
>		for_each_gt(gt, i915, i)
>			mutex_init(&gt->perf.lock);
> @@ -4958,6 +5056,11 @@ void i915_perf_init(struct drm_i915_private *i915)
>
>		perf->i915 = i915;
>
> +		ret = oa_init_engine_groups(perf);
> +		if (ret)
> +			drm_err(&i915->drm,
> +				"OA initialization failed %d\n", ret);
> +
>		oa_init_supported_formats(perf);
>	}
>  }
> @@ -4986,10 +5089,15 @@ void i915_perf_sysctl_unregister(void)
>  void i915_perf_fini(struct drm_i915_private *i915)
>  {
>	struct i915_perf *perf = &i915->perf;
> +	struct intel_gt *gt;
> +	int i;
>
>	if (!perf->i915)
>		return;
>
> +	for_each_gt(gt, perf->i915, i)
> +		kfree(gt->perf.group);
> +
>	idr_for_each(&perf->metrics_idr, destroy_config, perf);
>	idr_destroy(&perf->metrics_idr);
>
> diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
> index e36f046fe2b6..ce99551ad0fd 100644
> --- a/drivers/gpu/drm/i915/i915_perf_types.h
> +++ b/drivers/gpu/drm/i915/i915_perf_types.h
> @@ -17,6 +17,7 @@
>  #include <linux/wait.h>
>  #include <uapi/drm/i915_drm.h>
>
> +#include "gt/intel_engine_types.h"
>  #include "gt/intel_sseu.h"
>  #include "i915_reg_defs.h"
>  #include "intel_wakeref.h"
> @@ -30,6 +31,13 @@ struct i915_vma;
>  struct intel_context;
>  struct intel_engine_cs;
>

For the 'assigned but not used' comments below, I am basing this on this
patch and the last OAM patch, 9/9.

> +enum {
> +	PERF_GROUP_OAG = 0,
> +
> +	PERF_GROUP_MAX,
> +	PERF_GROUP_INVALID = U32_MAX,
> +};
> +
>  struct i915_oa_format {
>	u32 format;
>	int size;
> @@ -390,6 +398,35 @@ struct i915_oa_ops {
>	u32 (*oa_hw_tail_read)(struct i915_perf_stream *stream);
>  };
>
> +struct i915_perf_group {
> +	/*
> +	 * @type: Identifier for the OA unit.
> +	 */
> +	u32 oa_unit_id;

Assigned but not used, should be removed and introduced when needed.

> +
> +	/*
> +	 * @gt: gt that this group belongs to
> +	 */
> +	struct intel_gt *gt;

Not used either, suggest removing.

> +
> +	/*
> +	 * @exclusive_stream: The stream currently using the OA unit. This is
> +	 * sometimes accessed outside a syscall associated to its file
> +	 * descriptor.
> +	 */
> +	struct i915_perf_stream *exclusive_stream;
> +
> +	/*
> +	 * @num_engines: The number of engines using this OA buffer.

s/OA buffer/OA unit/

> +	 */
> +	u32 num_engines;
> +
> +	/*
> +	 * @engine_mask: A mask of engines using a single OA buffer.

s/OA buffer/OA unit/

> +	 */
> +	intel_engine_mask_t engine_mask;

Assigned but not used, should be removed and introduced when needed.

> +};
> +
>  struct i915_perf_gt {
>	/*
>	 * Lock associated with anything below within this structure.
> @@ -402,12 +439,15 @@ struct i915_perf_gt {
>	 */
>	struct intel_sseu sseu;
>
> +	/**
> +	 * @num_perf_groups: number of perf groups per gt.
> +	 */
> +	u32 num_perf_groups;

This is 1 in this series, so you could argue it is not needed, but we know of
future platforms where there might be > 1, so I think we can leave it in.

> +
>	/*
> -	 * @exclusive_stream: The stream currently using the OA unit. This is
> -	 * sometimes accessed outside a syscall associated to its file
> -	 * descriptor.
> +	 * @group: list of OA groups - one for each OA buffer.
>	 */
> -	struct i915_perf_stream *exclusive_stream;
> +	struct i915_perf_group *group;
>  };
>
>  struct i915_perf {
> @@ -461,6 +501,9 @@ struct i915_perf {
>	unsigned long format_mask[FORMAT_MASK_SIZE];
>
>	atomic64_t noa_programming_delay;
> +
> +	/* oa unit ids */
> +	u32 oa_unit_ids;

Assigned but not used; it should be removed because oa_unit_id is unused.
Also, if we need this later, maybe an idr is a better approach?

Also, if we remove the above members as suggested, oa_init_groups will
probably need to move to the last patch 9.

>  };
>
>  #endif /* _I915_PERF_TYPES_H_ */
> --
> 2.36.1
>

Thanks.
--
Ashutosh


* Re: [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units
  2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units Umesh Nerlige Ramappa
  2023-02-17 23:37   ` Umesh Nerlige Ramappa
@ 2023-02-23 20:05   ` Dixit, Ashutosh
  2023-02-25  0:58     ` Umesh Nerlige Ramappa
  1 sibling, 1 reply; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-23 20:05 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Thu, 16 Feb 2023 16:58:50 -0800, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

> MTL introduces additional OA units dedicated to media use cases. Add
> support for programming these OA units by passing the media engine class
> and instance parameters.
>
> UMD specific changes for GPUvis support:
> https://patchwork.freedesktop.org/patch/522827/?series=114023
> https://patchwork.freedesktop.org/patch/522822/?series=114023
> https://patchwork.freedesktop.org/patch/522826/?series=114023
> https://patchwork.freedesktop.org/patch/522828/?series=114023
> https://patchwork.freedesktop.org/patch/522816/?series=114023
> https://patchwork.freedesktop.org/patch/522825/?series=114023

A general comment about the patch, in case I miss an instance below: as I've
mentioned previously, let's try to replace the INTEL_METEORLAKE and
IS_METEORLAKE checks in the patch with:

	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))

That way we don't have to enumerate each platform individually later.
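
To show why a version comparison scales better than a platform list, here is
a small standalone sketch. The IP_VER() encoding below (version in the high
byte, release in the low byte) is modelled on the i915 macro but should be
treated as an assumption, and 'struct device' and has_media_oa() are names
made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack version.release so full versions compare as plain integers. */
#define IP_VER(ver, rel) (((uint32_t)(ver) << 8) | (uint32_t)(rel))

/* Hypothetical device description. */
struct device {
	uint8_t graphics_ver;
	uint8_t graphics_rel;
};

static uint32_t graphics_ver_full(const struct device *d)
{
	return IP_VER(d->graphics_ver, d->graphics_rel);
}

/* True for 12.70 and anything newer, without naming any platform. */
static bool has_media_oa(const struct device *d)
{
	return graphics_ver_full(d) >= IP_VER(12, 70);
}
```

A later platform with a higher version or release passes the same check
with no code change.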

> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h          |   2 +
>  drivers/gpu/drm/i915/i915_pci.c          |   1 +
>  drivers/gpu/drm/i915/i915_perf.c         | 247 ++++++++++++++++++++---
>  drivers/gpu/drm/i915/i915_perf_oa_regs.h |  78 +++++++
>  drivers/gpu/drm/i915/i915_perf_types.h   |  40 ++++
>  drivers/gpu/drm/i915/intel_device_info.h |   1 +
>  include/uapi/drm/i915_drm.h              |   4 +
>  7 files changed, 347 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0393273faa09..f3cacbf41c86 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -856,6 +856,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>	(INTEL_INFO(dev_priv)->has_oa_bpc_reporting)
>  #define HAS_OA_SLICE_CONTRIB_LIMITS(dev_priv) \
>	(INTEL_INFO(dev_priv)->has_oa_slice_contrib_limits)
> +#define HAS_OAM(dev_priv) \
> +	(INTEL_INFO(dev_priv)->has_oam)
>
>  /*
>   * Set this flag, when platform requires 64K GTT page sizes or larger for
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index a8d942b16223..621730b6551c 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -1028,6 +1028,7 @@ static const struct intel_device_info adl_p_info = {
>	.has_mslice_steering = 1, \
>	.has_oa_bpc_reporting = 1, \
>	.has_oa_slice_contrib_limits = 1, \
> +	.has_oam = 1, \
>	.has_rc6 = 1, \
>	.has_reset_engine = 1, \
>	.has_rps = 1, \
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index f028df812067..a57690f4c531 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -192,6 +192,7 @@
>   */
>
>  #include <linux/anon_inodes.h>
> +#include <linux/nospec.h>
>  #include <linux/sizes.h>
>  #include <linux/uuid.h>
>
> @@ -326,6 +327,13 @@ static const struct i915_oa_format oa_formats[I915_OA_FORMAT_MAX] = {
>	[I915_OA_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 },
>	[I915_OAR_FORMAT_A32u40_A4u32_B8_C8]    = { 5, 256 },
>	[I915_OA_FORMAT_A24u40_A14u32_B8_C8]    = { 5, 256 },
> +	[I915_OAM_FORMAT_MPEC8u64_B8_C8]	= { 1, 192, TYPE_OAM, HDR_64_BIT },
> +	[I915_OAM_FORMAT_MPEC8u32_B8_C8]	= { 2, 128, TYPE_OAM, HDR_64_BIT },
> +};
> +
> +/* PERF_GROUP_OAG is unused for oa_base, drop it for mtl */

What does this comment mean?

> +static const u32 mtl_oa_base[] = {
> +	[PERF_GROUP_OAM_SAMEDIA_0] = 0x393000,
>  };
>
>  #define SAMPLE_OA_REPORT      (1<<0)
> @@ -418,11 +426,17 @@ static void free_oa_config_bo(struct i915_oa_config_bo *oa_bo)
>	kfree(oa_bo);
>  }
>
> +static inline const
> +struct i915_perf_regs *__oa_regs(struct i915_perf_stream *stream)
> +{
> +	return &stream->oa_buffer.group->regs;

Should just use stream->engine->oa_group->regs, see near the bottom.

> +}
> +
>  static u32 gen12_oa_hw_tail_read(struct i915_perf_stream *stream)
>  {
>	struct intel_uncore *uncore = stream->uncore;
>
> -	return intel_uncore_read(uncore, GEN12_OAG_OATAILPTR) &
> +	return intel_uncore_read(uncore, __oa_regs(stream)->oa_tail_ptr) &
>	       GEN12_OAG_OATAILPTR_MASK;
>  }
>
> @@ -886,7 +900,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
>		i915_reg_t oaheadptr;
>
>		oaheadptr = GRAPHICS_VER(stream->perf->i915) == 12 ?

Should this be >= 12?

> -			    GEN12_OAG_OAHEADPTR : GEN8_OAHEADPTR;
> +			    __oa_regs(stream)->oa_head_ptr :
> +			    GEN8_OAHEADPTR;
>
>		spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>
> @@ -939,7 +954,8 @@ static int gen8_oa_read(struct i915_perf_stream *stream,
>		return -EIO;
>
>	oastatus_reg = GRAPHICS_VER(stream->perf->i915) == 12 ?

Should this be >= 12?

> -		       GEN12_OAG_OASTATUS : GEN8_OASTATUS;
> +		       __oa_regs(stream)->oa_status :
> +		       GEN8_OASTATUS;
>
>	oastatus = intel_uncore_read(uncore, oastatus_reg);
>
> @@ -1643,16 +1659,46 @@ free_noa_wait(struct i915_perf_stream *stream)
>	i915_vma_unpin_and_release(&stream->noa_wait, 0);
>  }
>
> +/*
> + * intel_engine_lookup_user ensures that most of engine specific checks are
> + * taken care of, however, we can run into a case where the OA unit catering to
> + * the engine passed by the user is disabled for some reason. In such cases,
> + * ensure oa unit corresponding to an engine is functional. If there are no
> + * engines in the group, the unit is disabled.
> + */
> +static bool oa_unit_functional(const struct intel_engine_cs *engine)
> +{
> +	return engine->oa_group && engine->oa_group->num_engines;
> +}
> +
>  static bool engine_supports_oa(const struct intel_engine_cs *engine)
>  {
>	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
>
>	switch (platform) {
> +	case INTEL_METEORLAKE:
> +		return engine->class == RENDER_CLASS ||
> +		       ((engine->class == VIDEO_DECODE_CLASS ||
> +			 engine->class == VIDEO_ENHANCEMENT_CLASS) &&
> +			engine->gt->type == GT_MEDIA);
>	default:
>		return engine->class == RENDER_CLASS;
>	}

As mentioned in a previous patch, this could just be:

	return engine->oa_group;

Because all these checks have already been done when the perf groups were
initialized so let's use that, as is done for oa_unit_functional.

One caution, though: to return engine->oa_group we'd have to remove
engine_supports_oa from __oa_engine_group, since engine->oa_group is not
yet assigned there. But I think the engine_supports_oa check in
__oa_engine_group is a duplication and should be removed.

>  }
>
> +static bool engine_class_supports_oa_format(struct intel_engine_cs *engine, int type)
> +{
> +	switch (engine->class) {
> +	case RENDER_CLASS:
> +		return type == TYPE_OAG;
> +	case VIDEO_DECODE_CLASS:
> +	case VIDEO_ENHANCEMENT_CLASS:
> +		return type == TYPE_OAM;
> +	default:
> +		return false;
> +	}
> +}
> +

Again, how about:

	return engine->oa_group && engine->oa_group->type == type;

Otherwise, as mentioned below, oa_group->type is unused and also incorrectly
assigned at present. The format type and group type are the same
(TYPE_OAG/TYPE_OAM). We can name the function engine_supports_oa_format.
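
Spelled out as a standalone sketch (the types are simplified stand-ins, and
this assumes oa_group->type is assigned correctly when the groups are
initialized):

```c
#include <stdbool.h>
#include <stddef.h>

enum oa_type {
	TYPE_OAG,
	TYPE_OAM,
};

/* Simplified stand-ins for the i915 structures. */
struct perf_group {
	enum oa_type type;
};

struct engine {
	struct perf_group *oa_group;	/* NULL if no OA unit serves it */
};

/* A format is usable only on an engine whose OA unit has the same type. */
static bool engine_supports_oa_format(const struct engine *engine,
				      enum oa_type type)
{
	return engine->oa_group && engine->oa_group->type == type;
}
```

This keeps the class -> unit knowledge in one place (group init) instead of
repeating the class switch.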

>  static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
>  {
>	struct i915_perf *perf = stream->perf;
> @@ -1680,7 +1726,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
>		drm_WARN_ON(&gt->i915->drm,
>			    intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc));
>
> -	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
> +	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
>	intel_engine_pm_put(stream->engine);
>
>	if (stream->ctx)
> @@ -1804,8 +1850,8 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
>
>	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>
> -	intel_uncore_write(uncore, GEN12_OAG_OASTATUS, 0);
> -	intel_uncore_write(uncore, GEN12_OAG_OAHEADPTR,
> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_status, 0);
> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_head_ptr,
>			   gtt_offset & GEN12_OAG_OAHEADPTR_MASK);
>	stream->oa_buffer.head = gtt_offset;
>
> @@ -1817,9 +1863,9 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
>	 *  to enable proper functionality of the overflow
>	 *  bit."
>	 */
> -	intel_uncore_write(uncore, GEN12_OAG_OABUFFER, gtt_offset |
> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_buffer, gtt_offset |
>			   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT);
> -	intel_uncore_write(uncore, GEN12_OAG_OATAILPTR,
> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_tail_ptr,
>			   gtt_offset & GEN12_OAG_OATAILPTR_MASK);
>
>	/* Mark that we need updated tail pointers to read from... */
> @@ -2579,7 +2625,8 @@ gen8_modify_self(struct intel_context *ce,
>	return err;
>  }
>
> -static int gen8_configure_context(struct i915_gem_context *ctx,
> +static int gen8_configure_context(struct i915_perf_stream *stream,
> +				  struct i915_gem_context *ctx,
>				  struct flex *flex, unsigned int count)
>  {
>	struct i915_gem_engines_iter it;
> @@ -2589,7 +2636,8 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
>	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
>		GEM_BUG_ON(ce == ce->engine->kernel_context);
>
> -		if (!engine_supports_oa(ce->engine))
> +		if (!engine_supports_oa(ce->engine) ||
> +		    ce->engine->class != stream->engine->class)
>			continue;
>
>		/* Otherwise OA settings will be set upon first use */
> @@ -2720,7 +2768,7 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
>
>		spin_unlock(&i915->gem.contexts.lock);
>
> -		err = gen8_configure_context(ctx, regs, num_regs);
> +		err = gen8_configure_context(stream, ctx, regs, num_regs);
>		if (err) {
>			i915_gem_context_put(ctx);
>			return err;
> @@ -2740,7 +2788,8 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
>	for_each_uabi_engine(engine, i915) {
>		struct intel_context *ce = engine->kernel_context;
>
> -		if (!engine_supports_oa(ce->engine))
> +		if (!engine_supports_oa(ce->engine) ||
> +		    ce->engine->class != stream->engine->class)
>			continue;
>
>		regs[0].value = intel_sseu_make_rpcs(engine->gt, &ce->sseu);
> @@ -2765,6 +2814,9 @@ gen12_configure_all_contexts(struct i915_perf_stream *stream,
>		},
>	};
>
> +	if (stream->engine->class != RENDER_CLASS)
> +		return 0;

OK, this is for render, nothing equivalent needed for media?

> +
>	return oa_configure_all_contexts(stream,
>					 regs, ARRAY_SIZE(regs),
>					 active);
> @@ -2894,7 +2946,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
>				   _MASKED_BIT_ENABLE(GEN12_DISABLE_DOP_GATING));
>	}
>
> -	intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG,
> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_debug,
>			   /* Disable clk ratio reports, like previous Gens. */
>			   _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS |
>					      GEN12_OAG_OA_DEBUG_INCLUDE_CLK_RATIO) |
> @@ -2904,7 +2956,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
>			    */
>			   oag_report_ctx_switches(stream));
>
> -	intel_uncore_write(uncore, GEN12_OAG_OAGLBCTXCTRL, periodic ?
> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctx_ctrl, periodic ?
>			   (GEN12_OAG_OAGLBCTXCTRL_COUNTER_RESUME |
>			    GEN12_OAG_OAGLBCTXCTRL_TIMER_ENABLE |
>			    (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT))
> @@ -3058,8 +3110,8 @@ static void gen8_oa_enable(struct i915_perf_stream *stream)
>
>  static void gen12_oa_enable(struct i915_perf_stream *stream)
>  {
> -	struct intel_uncore *uncore = stream->uncore;
> -	u32 report_format = stream->oa_buffer.format->format;
> +	const struct i915_perf_regs *regs;
> +	u32 val;
>
>	/*
>	 * If we don't want OA reports from the OA buffer, then we don't even
> @@ -3070,9 +3122,11 @@ static void gen12_oa_enable(struct i915_perf_stream *stream)
>
>	gen12_init_oa_buffer(stream);
>
> -	intel_uncore_write(uncore, GEN12_OAG_OACONTROL,
> -			   (report_format << GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT) |
> -			   GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE);
> +	regs = __oa_regs(stream);
> +	val = (stream->oa_buffer.format->format << regs->oa_ctrl_counter_format_shift) |
> +	      GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE;
> +
> +	intel_uncore_write(stream->uncore, regs->oa_ctrl, val);
>  }
>
>  /**
> @@ -3124,9 +3178,9 @@ static void gen12_oa_disable(struct i915_perf_stream *stream)
>  {
>	struct intel_uncore *uncore = stream->uncore;
>
> -	intel_uncore_write(uncore, GEN12_OAG_OACONTROL, 0);
> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctrl, 0);
>	if (intel_wait_for_register(uncore,
> -				    GEN12_OAG_OACONTROL,
> +				    __oa_regs(stream)->oa_ctrl,
>				    GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE, 0,
>				    50))
>		drm_err(&stream->perf->i915->drm,
> @@ -3329,6 +3383,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>
>	stream->sample_size = sizeof(struct drm_i915_perf_record_header);
>
> +	stream->oa_buffer.group = g;

Should just use stream->engine->oa_group, see near the bottom.

>	stream->oa_buffer.format = &perf->oa_formats[props->oa_format];
>	if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format->size == 0))
>		return -EINVAL;
> @@ -3379,7 +3434,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>	 *   references will effectively disable RC6.
>	 */
>	intel_engine_pm_get(stream->engine);
> -	intel_uncore_forcewake_get(stream->uncore, FORCEWAKE_ALL);
> +	intel_uncore_forcewake_get(stream->uncore, g->fw_domains);
>
>	/*
>	 * Wa_16011777198:dg2: GuC resets render as part of the Wa. This causes
> @@ -3440,7 +3495,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>		intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc);
>
>  err_gucrc:
> -	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
> +	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
>	intel_engine_pm_put(stream->engine);
>
>	free_oa_configs(stream);
> @@ -4033,6 +4088,7 @@ static int read_properties_unlocked(struct i915_perf *perf,
>				    struct perf_open_properties *props)
>  {
>	struct drm_i915_gem_context_param_sseu user_sseu;
> +	const struct i915_oa_format *f;
>	u64 __user *uprop = uprops;
>	bool config_sseu = false;
>	u8 class, instance;
> @@ -4203,6 +4259,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
>	if (!engine_supports_oa(props->engine))
>		return -EINVAL;
>
> +	if (!oa_unit_functional(props->engine))
> +		return -ENODEV;
> +
> +	i = array_index_nospec(props->oa_format, I915_OA_FORMAT_MAX);

Why do we need this (something to do with speculation)? Can't we just use
'&perf->oa_formats[props->oa_format]' below? The format passed in has
already been checked in the switch statement above.
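
On the speculation question, a minimal user-space sketch of what
array_index_nospec() is meant to achieve may help. It assumes the generic
(non arch-specific) mask computation and an arithmetic right shift of
negative values, as GCC and Clang provide. index_mask_sketch() and
index_nospec_sketch() are hypothetical stand-ins, not kernel API.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG_SKETCH (sizeof(long) * CHAR_BIT)

/*
 * Hypothetical stand-in for the generic mask helper: returns all ones when
 * index < size and zero otherwise, without taking a branch.
 */
static unsigned long index_mask_sketch(unsigned long index, unsigned long size)
{
	/* The trick relies on size fitting in the positive range of long. */
	assert(size != 0 && size <= (unsigned long)LONG_MAX);

	/*
	 * For index >= size, either index itself or (size - 1 - index) has
	 * its top bit set, so the inverted value is non-negative and the
	 * arithmetic shift yields 0. Otherwise the shift yields all ones.
	 */
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (BITS_PER_LONG_SKETCH - 1));
}

/* Clamp index into [0, size): in-range values pass, others become 0. */
static unsigned long index_nospec_sketch(unsigned long index,
					 unsigned long size)
{
	return index & index_mask_sketch(index, size);
}
```

The earlier switch statement rejects a bad format architecturally. The mask
only matters if that check is mispredicted, in which case an out-of-range
index is forced to 0 instead of being used to read past the table.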

> +	f = &perf->oa_formats[i];
> +	if (!engine_class_supports_oa_format(props->engine, f->type)) {
> +		DRM_DEBUG("Invalid OA format %d for class %d\n",
> +			  f->type, props->engine->class);
> +		return -EINVAL;
> +	}
> +
>	if (config_sseu) {
>		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
>		if (ret) {
> @@ -4383,6 +4450,14 @@ static const struct i915_range gen12_oa_b_counters[] = {
>	{}
>  };
>
> +static const struct i915_range mtl_oam_b_counters[] = {
> +	{ .start = 0x393000, .end = 0x39301c },	/* GEN12_OAM_STARTTRIG1[1-8] */
> +	{ .start = 0x393020, .end = 0x39303c },	/* GEN12_OAM_REPORTTRIG1[1-8] */
> +	{ .start = 0x393040, .end = 0x39307c },	/* GEN12_OAM_CEC[0-7][0-1] */
> +	{ .start = 0x393200, .end = 0x39323C },	/* MPES[0-7] */
> +	{}
> +};
> +
>  static const struct i915_range xehp_oa_b_counters[] = {
>	{ .start = 0xdc48, .end = 0xdc48 },	/* OAA_ENABLE_REG */
>	{ .start = 0xdd00, .end = 0xdd48 },	/* OAG_LCE0_0 - OAA_LENABLE_REG */
> @@ -4429,13 +4504,16 @@ static const struct i915_range gen12_oa_mux_regs[] = {
>
>  /*
>   * Ref: 14010536224:
> - * 0x20cc is repurposed on MTL, so use a separate array for MTL.
> + * 0x20cc is repurposed on MTL, so use a separate array for MTL. Also add the
> + * MPES/MPEC registers.

MPES/MPEC registers are now added above, not here, so maybe drop the
comment change above?

>   */
>  static const struct i915_range mtl_oa_mux_regs[] = {
>	{ .start = 0x0d00, .end = 0x0d04 },	/* RPM_CONFIG[0-1] */
>	{ .start = 0x0d0c, .end = 0x0d2c },	/* NOA_CONFIG[0-8] */
>	{ .start = 0x9840, .end = 0x9840 },	/* GDT_CHICKEN_BITS */
>	{ .start = 0x9884, .end = 0x9888 },	/* NOA_WRITE */
> +	{ .start = 0x38d100, .end = 0x38d114},	/* VISACTL */
> +	{}
>  };
>
>  static bool gen7_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
> @@ -4473,10 +4551,26 @@ static bool gen12_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
>	return reg_in_range_table(addr, gen12_oa_b_counters);
>  }
>
> +static bool xehp_is_valid_oam_b_counter_addr(struct i915_perf *perf, u32 addr)
> +{
> +	enum intel_platform platform = INTEL_INFO(perf->i915)->platform;
> +
> +	if (!HAS_OAM(perf->i915))
> +		return false;
> +
> +	switch (platform) {
> +	case INTEL_METEORLAKE:
> +		return reg_in_range_table(addr, mtl_oam_b_counters);
> +	default:
> +		return false;
> +	}

Replace with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))', registers
are identical in later platforms too.
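
A rough, self-contained sketch of the suggested version check follows. The
address ranges are copied from mtl_oam_b_counters in the patch, and IP_VER
packs version and release as in i915. struct range_sketch and
oam_b_counter_addr_valid() are hypothetical names used only for
illustration, with the device state passed in as plain parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Version in the high byte, release in the low byte. */
#define IP_VER(ver, rel) ((unsigned int)(ver) << 8 | (rel))

struct range_sketch {
	uint32_t start;
	uint32_t end;
};

/* Ranges taken from mtl_oam_b_counters; the zero entry terminates it. */
static const struct range_sketch oam_b_counters[] = {
	{ 0x393000, 0x39301c },	/* GEN12_OAM_STARTTRIG1[1-8] */
	{ 0x393020, 0x39303c },	/* GEN12_OAM_REPORTTRIG1[1-8] */
	{ 0x393040, 0x39307c },	/* GEN12_OAM_CEC[0-7][0-1] */
	{ 0x393200, 0x39323c },	/* MPES[0-7] */
	{ 0, 0 },
};

/* Walk the table until the zero terminator, testing inclusive bounds. */
static bool reg_in_range_table(uint32_t addr, const struct range_sketch *table)
{
	assert(table);

	for (; table->start || table->end; table++)
		if (addr >= table->start && addr <= table->end)
			return true;

	return false;
}

/* One version comparison replaces the per-platform switch. */
static bool oam_b_counter_addr_valid(unsigned int ver_full, bool has_oam,
				     uint32_t addr)
{
	if (!has_oam || ver_full < IP_VER(12, 70))
		return false;

	return reg_in_range_table(addr, oam_b_counters);
}
```

With this shape, a later platform with the same registers needs no new case
label, only a version that compares greater than or equal to 12.70.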

Should the function prefix be xehp or mtl? Don't see xehp in bspec,
probably xehp is discontinued.

> +}
> +
>  static bool xehp_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
>  {
>	return reg_in_range_table(addr, xehp_oa_b_counters) ||
> -		reg_in_range_table(addr, gen12_oa_b_counters);
> +		reg_in_range_table(addr, gen12_oa_b_counters) ||
> +		xehp_is_valid_oam_b_counter_addr(perf, addr);
>  }
>
>  static bool gen12_is_valid_mux_addr(struct i915_perf *perf, u32 addr)
> @@ -4846,11 +4940,39 @@ static u32 __num_perf_groups_per_gt(struct intel_gt *gt)
>	enum intel_platform platform = INTEL_INFO(gt->i915)->platform;
>
>	switch (platform) {
> +	case INTEL_METEORLAKE:
> +		return 1;

I don't think we need this, as proposed previously maybe the function
should just unconditionally return 1.

>	default:
>		return 1;
>	}
>  }
>
> +static u32 __oam_engine_group(struct intel_engine_cs *engine)
> +{
> +	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
> +	struct intel_gt *gt = engine->gt;
> +	u32 group = PERF_GROUP_INVALID;
> +
> +	switch (platform) {
> +	case INTEL_METEORLAKE:

Replace here with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))'.

> +		/*
> +		 * There's 1 SAMEDIA gt and 1 OAM per SAMEDIA gt. All media slices
> +		 * within the gt use the same OAM. All MTL SKUs list 1 SA MEDIA.
> +		 */
> +		drm_WARN_ON(&engine->i915->drm,
> +			    engine->gt->type != GT_MEDIA);
> +
> +		group = PERF_GROUP_OAM_SAMEDIA_0;
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	drm_WARN_ON(&gt->i915->drm, group >= __num_perf_groups_per_gt(gt));
> +
> +	return group;
> +}
> +
>  static u32 __oa_engine_group(struct intel_engine_cs *engine)
>  {
>	if (!engine_supports_oa(engine))

As mentioned above for engine_supports_oa, this looks like a duplication of
the checks below and should probably be removed.

> @@ -4860,11 +4982,58 @@ static u32 __oa_engine_group(struct intel_engine_cs *engine)
>	case RENDER_CLASS:
>		return PERF_GROUP_OAG;
>
> +	case VIDEO_DECODE_CLASS:
> +	case VIDEO_ENHANCEMENT_CLASS:
> +		return __oam_engine_group(engine);
> +
>	default:
>		return PERF_GROUP_INVALID;
>	}
>  }
>
> +static struct i915_perf_regs __oam_regs(u32 base)
> +{
> +	return (struct i915_perf_regs) {
> +		base,
> +		GEN12_OAM_HEAD_POINTER(base),
> +		GEN12_OAM_TAIL_POINTER(base),
> +		GEN12_OAM_BUFFER(base),
> +		GEN12_OAM_CONTEXT_CONTROL(base),
> +		GEN12_OAM_CONTROL(base),
> +		GEN12_OAM_DEBUG(base),
> +		GEN12_OAM_STATUS(base),
> +		GEN12_OAM_CONTROL_COUNTER_FORMAT_SHIFT,
> +	};
> +}
> +
> +static struct i915_perf_regs __oag_regs(void)
> +{
> +	return (struct i915_perf_regs) {
> +		0,
> +		GEN12_OAG_OAHEADPTR,
> +		GEN12_OAG_OATAILPTR,
> +		GEN12_OAG_OABUFFER,
> +		GEN12_OAG_OAGLBCTXCTRL,
> +		GEN12_OAG_OACONTROL,
> +		GEN12_OAG_OA_DEBUG,
> +		GEN12_OAG_OASTATUS,
> +		GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT,
> +	};
> +}
> +
> +static void oa_init_regs(struct intel_gt *gt, u32 id)
> +{
> +	struct i915_perf_group *group = &gt->perf.group[id];
> +	struct i915_perf_regs *regs = &group->regs;
> +
> +	if (id == PERF_GROUP_OAG && gt->type != GT_MEDIA)
> +		*regs = __oag_regs();
> +	else if (IS_METEORLAKE(gt->i915))

Replace with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))', OAM
registers are identical in later platforms too. Maybe get rid of drm_WARN
below?

> +		*regs = __oam_regs(mtl_oa_base[id]);
> +	else
> +		drm_WARN(&gt->i915->drm, 1, "Unsupported platform for OA\n");
> +}
> +
>  static void oa_init_groups(struct intel_gt *gt)
>  {
>	int i, num_groups = gt->perf.num_perf_groups;
> @@ -4881,6 +5050,24 @@ static void oa_init_groups(struct intel_gt *gt)
>		g->oa_unit_id = perf->oa_unit_ids++;
>
>		g->gt = gt;
> +		oa_init_regs(gt, i);
> +		g->fw_domains = FORCEWAKE_ALL;
> +		if (i == PERF_GROUP_OAG) {
> +			g->type = TYPE_OAG;
> +
> +			/*
> +			 * Enabling all fw domains for OAG caps the max GT
> +			 * frequency to media FF max. This could be less than
> +			 * what the user sets through the sysfs and perf
> +			 * measurements could be skewed. Since some platforms
> +			 * have separate OAM units to measure media perf, do not
> +			 * enable media fw domains for OAG.
> +			 */
> +			if (HAS_OAM(gt->i915))
> +				g->fw_domains = FORCEWAKE_GT | FORCEWAKE_RENDER;

Is this needed even when media and render are separate tiles, which is the
only case we have in this code right now? For separate tiles setting
FORCEWAKE_ALL should not cap the freq, correct?

If not needed we can get rid of g->fw_domains.

> +		} else {
> +			g->type = TYPE_OAM;

This is wrong, because num_perf_groups is 1. So the type should be assigned
not based on i (which is always 0) but maybe similar to what is done in
oa_init_regs above.

We are getting away with this only because g->type is unused, as mentioned below.
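
To illustrate the problem, here is a small sketch under simplifying
assumptions. The group enum values mirror the patch, where both names alias
to 0. enum gt_type_sketch is a cut-down stand-in for the real GT type, and
the two helper functions are hypothetical.

```c
#include <assert.h>

/* Both group ids are 0 in the patch, so the index cannot tell them apart. */
enum {
	PERF_GROUP_OAG = 0,
	PERF_GROUP_OAM_SAMEDIA_0 = 0,
};

enum oa_type {
	TYPE_OAG,
	TYPE_OAM,
};

/* Simplified stand-in for the GT type; only two kinds are modelled. */
enum gt_type_sketch {
	GT_PRIMARY_SKETCH,
	GT_MEDIA_SKETCH,
};

/* What the patch does: decide purely on the loop index. */
static enum oa_type type_from_index(int i)
{
	assert(i >= 0);
	return i == PERF_GROUP_OAG ? TYPE_OAG : TYPE_OAM;
}

/* Alternative along the lines of oa_init_regs(): also look at the GT. */
static enum oa_type type_from_gt(int i, enum gt_type_sketch gt)
{
	assert(i >= 0);
	return (i == PERF_GROUP_OAG && gt != GT_MEDIA_SKETCH) ?
		TYPE_OAG : TYPE_OAM;
}
```

With a single group per GT, type_from_index() reports TYPE_OAG for the media
GT as well, while type_from_gt() distinguishes the two.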

> +		}
>	}
>  }
>
> @@ -4970,9 +5157,15 @@ static void oa_init_supported_formats(struct i915_perf *perf)
>		break;
>
>	case INTEL_DG2:
> +		oa_format_add(perf, I915_OAR_FORMAT_A32u40_A4u32_B8_C8);
> +		oa_format_add(perf, I915_OA_FORMAT_A24u40_A14u32_B8_C8);
> +		break;
> +
>	case INTEL_METEORLAKE:
>		oa_format_add(perf, I915_OAR_FORMAT_A32u40_A4u32_B8_C8);
>		oa_format_add(perf, I915_OA_FORMAT_A24u40_A14u32_B8_C8);
> +		oa_format_add(perf, I915_OAM_FORMAT_MPEC8u64_B8_C8);
> +		oa_format_add(perf, I915_OAM_FORMAT_MPEC8u32_B8_C8);
>		break;
>
>	default:
> @@ -5217,8 +5410,10 @@ int i915_perf_ioctl_version(void)
>	 *
>	 * 6: Add DRM_I915_PERF_PROP_OA_ENGINE_CLASS and
>	 *    DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE
> +	 *
> +	 * 7: Add support for video decode and enhancement classes.
>	 */
> -	return 6;
> +	return 7;

Let's figure out if we want to club all this into 6.

>  }
>
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> diff --git a/drivers/gpu/drm/i915/i915_perf_oa_regs.h b/drivers/gpu/drm/i915/i915_perf_oa_regs.h
> index 381d94101610..ba103875e19f 100644
> --- a/drivers/gpu/drm/i915/i915_perf_oa_regs.h
> +++ b/drivers/gpu/drm/i915/i915_perf_oa_regs.h
> @@ -138,4 +138,82 @@
>  #define   GEN12_SQCNT1_PMON_ENABLE		REG_BIT(30)
>  #define   GEN12_SQCNT1_OABPC			REG_BIT(29)
>
> +/* Gen12 OAM unit */
> +#define GEN12_OAM_HEAD_POINTER_OFFSET   (0x1a0)
> +#define  GEN12_OAM_HEAD_POINTER_MASK    0xffffffc0
> +
> +#define GEN12_OAM_TAIL_POINTER_OFFSET   (0x1a4)
> +#define  GEN12_OAM_TAIL_POINTER_MASK    0xffffffc0
> +
> +#define GEN12_OAM_BUFFER_OFFSET         (0x1a8)
> +#define  GEN12_OAM_BUFFER_SIZE_MASK     (0x7)
> +#define  GEN12_OAM_BUFFER_SIZE_SHIFT    (3)
> +#define  GEN12_OAM_BUFFER_MEMORY_SELECT REG_BIT(0) /* 0: PPGTT, 1: GGTT */
> +
> +#define GEN12_OAM_CONTEXT_CONTROL_OFFSET              (0x1bc)
> +#define  GEN12_OAM_CONTEXT_CONTROL_TIMER_PERIOD_SHIFT 2
> +#define  GEN12_OAM_CONTEXT_CONTROL_TIMER_ENABLE       REG_BIT(1)
> +#define  GEN12_OAM_CONTEXT_CONTROL_COUNTER_RESUME     REG_BIT(0)
> +
> +#define GEN12_OAM_CONTROL_OFFSET                (0x194)
> +#define  GEN12_OAM_CONTROL_COUNTER_FORMAT_SHIFT 1
> +#define  GEN12_OAM_CONTROL_COUNTER_ENABLE       REG_BIT(0)
> +
> +#define GEN12_OAM_DEBUG_OFFSET                      (0x198)
> +#define  GEN12_OAM_DEBUG_BUFFER_SIZE_SELECT         REG_BIT(12)
> +#define  GEN12_OAM_DEBUG_INCLUDE_CLK_RATIO          REG_BIT(6)
> +#define  GEN12_OAM_DEBUG_DISABLE_CLK_RATIO_REPORTS  REG_BIT(5)
> +#define  GEN12_OAM_DEBUG_DISABLE_GO_1_0_REPORTS     REG_BIT(2)
> +#define  GEN12_OAM_DEBUG_DISABLE_CTX_SWITCH_REPORTS REG_BIT(1)
> +
> +#define GEN12_OAM_STATUS_OFFSET            (0x19c)
> +#define  GEN12_OAM_STATUS_COUNTER_OVERFLOW REG_BIT(2)
> +#define  GEN12_OAM_STATUS_BUFFER_OVERFLOW  REG_BIT(1)
> +#define  GEN12_OAM_STATUS_REPORT_LOST      REG_BIT(0)
> +
> +#define GEN12_OAM_MMIO_TRG_OFFSET	(0x1d0)
> +
> +#define GEN12_OAM_MMIO_TRG(base) \
> +	_MMIO((base) + GEN12_OAM_MMIO_TRG_OFFSET)
> +
> +#define GEN12_OAM_HEAD_POINTER(base) \
> +	_MMIO((base) + GEN12_OAM_HEAD_POINTER_OFFSET)
> +#define GEN12_OAM_TAIL_POINTER(base) \
> +	_MMIO((base) + GEN12_OAM_TAIL_POINTER_OFFSET)
> +#define GEN12_OAM_BUFFER(base) \
> +	_MMIO((base) + GEN12_OAM_BUFFER_OFFSET)
> +#define GEN12_OAM_CONTEXT_CONTROL(base) \
> +	_MMIO((base) + GEN12_OAM_CONTEXT_CONTROL_OFFSET)
> +#define GEN12_OAM_CONTROL(base) \
> +	_MMIO((base) + GEN12_OAM_CONTROL_OFFSET)
> +#define GEN12_OAM_DEBUG(base) \
> +	_MMIO((base) + GEN12_OAM_DEBUG_OFFSET)
> +#define GEN12_OAM_STATUS(base) \
> +	_MMIO((base) + GEN12_OAM_STATUS_OFFSET)
> +
> +#define GEN12_OAM_CEC0_0_OFFSET		(0x40)
> +#define GEN12_OAM_CEC7_1_OFFSET		(0x7c)
> +#define GEN12_OAM_CEC0_0(base) \
> +	_MMIO((base) + GEN12_OAM_CEC0_0_OFFSET)
> +#define GEN12_OAM_CEC7_1(base) \
> +	_MMIO((base) + GEN12_OAM_CEC7_1_OFFSET)
> +
> +#define GEN12_OAM_STARTTRIG1_OFFSET	(0x00)
> +#define GEN12_OAM_STARTTRIG8_OFFSET	(0x1c)
> +#define GEN12_OAM_STARTTRIG1(base) \
> +	_MMIO((base) + GEN12_OAM_STARTTRIG1_OFFSET)
> +#define GEN12_OAM_STARTTRIG8(base) \
> +	_MMIO((base) + GEN12_OAM_STARTTRIG8_OFFSET)
> +
> +#define GEN12_OAM_REPORTTRIG1_OFFSET	(0x20)
> +#define GEN12_OAM_REPORTTRIG8_OFFSET	(0x3c)
> +#define GEN12_OAM_REPORTTRIG1(base) \
> +	_MMIO((base) + GEN12_OAM_REPORTTRIG1_OFFSET)
> +#define GEN12_OAM_REPORTTRIG8(base) \
> +	_MMIO((base) + GEN12_OAM_REPORTTRIG8_OFFSET)
> +
> +#define GEN12_OAM_PERF_COUNTER_B0_OFFSET	(0x84)
> +#define GEN12_OAM_PERF_COUNTER_B(base, idx) \
> +	_MMIO((base) + GEN12_OAM_PERF_COUNTER_B0_OFFSET + 4 * (idx))
> +
>  #endif /* __INTEL_PERF_OA_REGS__ */
> diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
> index 8ccb0b89d019..5b2c3bab60f8 100644
> --- a/drivers/gpu/drm/i915/i915_perf_types.h
> +++ b/drivers/gpu/drm/i915/i915_perf_types.h
> @@ -20,6 +20,7 @@
>  #include "gt/intel_engine_types.h"
>  #include "gt/intel_sseu.h"
>  #include "i915_reg_defs.h"
> +#include "intel_uncore.h"
>  #include "intel_wakeref.h"
>
>  struct drm_i915_private;
> @@ -33,6 +34,7 @@ struct intel_engine_cs;
>
>  enum {
>	PERF_GROUP_OAG = 0,
> +	PERF_GROUP_OAM_SAMEDIA_0 = 0,
>
>	PERF_GROUP_MAX,
>	PERF_GROUP_INVALID = U32_MAX,
> @@ -43,9 +45,27 @@ enum report_header {
>	HDR_64_BIT,
>  };
>
> +struct i915_perf_regs {
> +	u32 base;
> +	i915_reg_t oa_head_ptr;
> +	i915_reg_t oa_tail_ptr;
> +	i915_reg_t oa_buffer;
> +	i915_reg_t oa_ctx_ctrl;
> +	i915_reg_t oa_ctrl;
> +	i915_reg_t oa_debug;
> +	i915_reg_t oa_status;
> +	u32 oa_ctrl_counter_format_shift;
> +};
> +
> +enum {

enum oa_type?

> +	TYPE_OAG,
> +	TYPE_OAM,
> +};
> +
>  struct i915_oa_format {
>	u32 format;
>	int size;
> +	int type;
>	enum report_header header;
>  };
>
> @@ -317,6 +337,11 @@ struct i915_perf_stream {
>		 * @tail: The last verified tail that can be read by userspace.
>		 */
>		u32 tail;
> +
> +		/**
> +		 * @group: The group object for this OA buffer.
> +		 */
> +		struct i915_perf_group *group;

Isn't this just stream->engine->oa_group, so let's use that instead rather
than duplicating?

>	} oa_buffer;
>
>	/**
> @@ -431,6 +456,21 @@ struct i915_perf_group {
>	 * @engine_mask: A mask of engines using a single OA buffer.
>	 */
>	intel_engine_mask_t engine_mask;
> +
> +	/*
> +	 * @regs: OA buffer register group for programming the OA unit.
> +	 */
> +	struct i915_perf_regs regs;
> +
> +	/*
> +	 * @type: Type of OA buffer, OAM, OAG etc.

s/OA buffer/OA unit/

> +	 */
> +	int type;

Also (incorrectly) assigned but currently unused unless we make the change
to engine_class_supports_oa_format mentioned above. But I think we should
retain and use this.

> +
> +	/*
> +	 * @fw_domains: forcewake domains required for this group.
> +	 */
> +	enum forcewake_domains fw_domains;

Let's see if we need this.

>  };
>
>  struct i915_perf_gt {
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 80bda653d61b..45e218327f44 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -166,6 +166,7 @@ enum intel_ppgtt_type {
>	func(has_mslice_steering); \
>	func(has_oa_bpc_reporting); \
>	func(has_oa_slice_contrib_limits); \
> +	func(has_oam); \
>	func(has_one_eu_per_fuse_bit); \
>	func(has_pxp); \
>	func(has_rc6); \
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index b6922b52d85c..70bfa6530dbc 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -2676,6 +2676,10 @@ enum drm_i915_oa_format {
>	I915_OAR_FORMAT_A32u40_A4u32_B8_C8,
>	I915_OA_FORMAT_A24u40_A14u32_B8_C8,
>
> +	/* MTL OAM */
> +	I915_OAM_FORMAT_MPEC8u64_B8_C8,
> +	I915_OAM_FORMAT_MPEC8u32_B8_C8,
> +
>	I915_OA_FORMAT_MAX	    /* non-ABI */
>  };

Thanks.
--
Ashutosh

* Re: [Intel-gfx] [PATCH v2 4/9] drm/i915/perf: Group engines into respective OA groups
  2023-02-22 21:52   ` Dixit, Ashutosh
@ 2023-02-24 17:30     ` Umesh Nerlige Ramappa
  0 siblings, 0 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-24 17:30 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

On Wed, Feb 22, 2023 at 01:52:23PM -0800, Dixit, Ashutosh wrote:
>On Thu, 16 Feb 2023 16:58:45 -0800, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh,
>
>> Now that we may have multiple OA units in a single GT as well as on
>> separate GTs, create an engine group that maps to a single OA unit.
>>
>> v2: (Jani)
>> - Drop warning on ENOMEM
>> - Reorder patch in the series
>>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>>  drivers/gpu/drm/i915/gt/intel_engine_types.h |   4 +
>>  drivers/gpu/drm/i915/gt/intel_sseu.c         |   3 +-
>>  drivers/gpu/drm/i915/i915_perf.c             | 124 +++++++++++++++++--
>>  drivers/gpu/drm/i915/i915_perf_types.h       |  51 +++++++-
>>  4 files changed, 169 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
>> index 4fd54fb8810f..8a8b0dce241b 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
>> @@ -53,6 +53,8 @@ struct intel_gt;
>>  struct intel_ring;
>>  struct intel_uncore;
>>  struct intel_breadcrumbs;
>> +struct intel_engine_cs;
>> +struct i915_perf_group;
>>
>>  typedef u32 intel_engine_mask_t;
>>  #define ALL_ENGINES ((intel_engine_mask_t)~0ul)
>> @@ -603,6 +605,8 @@ struct intel_engine_cs {
>>	} props, defaults;
>>
>>	I915_SELFTEST_DECLARE(struct fault_attr reset_timeout);
>> +
>> +	struct i915_perf_group *oa_group;
>
>I think 'struct i915_oa_unit' is a better name (since it suggests a HW
>entity), but since changing it means changing it everywhere, leave it as
>is, with a comment to the effect that:
>
>	1 OA unit <-> 1 OA buffer <-> 1 perf group
>
>>  };
>>
>>  static inline bool
>> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
>> index 6c6198a257ac..1141f875f5bd 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
>> @@ -6,6 +6,7 @@
>>  #include <linux/string_helpers.h>
>>
>>  #include "i915_drv.h"
>> +#include "i915_perf_types.h"
>>  #include "intel_engine_regs.h"
>>  #include "intel_gt_regs.h"
>>  #include "intel_sseu.h"
>> @@ -677,7 +678,7 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt,
>>	 * If i915/perf is active, we want a stable powergating configuration
>>	 * on the system. Use the configuration pinned by i915/perf.
>>	 */
>> -	if (gt->perf.exclusive_stream)
>> +	if (gt->perf.group && gt->perf.group[PERF_GROUP_OAG].exclusive_stream)
>
>I haven't looked into what this function does, hopefully ok to do this only
>for OAG?

This function builds the value that should be programmed into
PWR_CLK_STATE register which exists only for render.

Will add remaining comments

Thanks,
Umesh

* Re: [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports
  2023-02-21 18:51         ` Dixit, Ashutosh
@ 2023-02-24 19:12           ` Umesh Nerlige Ramappa
  0 siblings, 0 replies; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-24 19:12 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

On Tue, Feb 21, 2023 at 10:51:57AM -0800, Dixit, Ashutosh wrote:
>On Fri, 17 Feb 2023 17:57:02 -0800, Dixit, Ashutosh wrote:
>>
>> On Fri, 17 Feb 2023 16:05:50 -0800, Umesh Nerlige Ramappa wrote:
>> > On Fri, Feb 17, 2023 at 12:58:18PM -0800, Dixit, Ashutosh wrote:
>> > > On Thu, 16 Feb 2023 16:58:48 -0800, Umesh Nerlige Ramappa wrote:
>> > >>
>> > >
>> > > Hi Umesh, couple of nits below.
>> > >
>> > >> Some of the newer OA formats are not powers of 2. For those formats,
>> > >> adjust the hw_tail accordingly when checking for new reports.
>> > >>
>> > >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> > >> ---
>> > >>  drivers/gpu/drm/i915/i915_perf.c | 50 ++++++++++++++++++--------------
>> > >>  1 file changed, 28 insertions(+), 22 deletions(-)
>> > >>
>> > >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> > >> index 9715b964aa1e..d3a1892c93be 100644
>> > >> --- a/drivers/gpu/drm/i915/i915_perf.c
>> > >> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> > >> @@ -542,6 +542,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>> > >>	bool pollin;
>> > >>	u32 hw_tail;
>> > >>	u64 now;
>> > >> +	u32 partial_report_size;
>> > >>
>> > >>	/* We have to consider the (unlikely) possibility that read() errors
>> > >>	 * could result in an OA buffer reset which might reset the head and
>> > >> @@ -551,10 +552,16 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
>> > >>
>> > >>	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
>> > >>
>> > >> -	/* The tail pointer increases in 64 byte increments,
>> > >> -	 * not in report_size steps...
>> > >> +	/* The tail pointer increases in 64 byte increments, whereas report
>> > >> +	 * sizes need not be integral multiples or 64 or powers of 2.
>> > > s/or/of/ ---------------------------------------^
>> > >
>> > > Also I think report sizes can only be multiples of 64, the ones we have
>> > > seen till now definitely are. Also the lower 6 bits of tail pointer are 0.
>> >
>> > Agree, the only addition to the old comment should be that the new reports
>> > may not be powers of 2.
>> >
>> > >
>> > >> +	 * Compute potentially partially landed report in the OA buffer
>> > >>	 */
>> > >> -	hw_tail &= ~(report_size - 1);
>> > >> +	partial_report_size = OA_TAKEN(hw_tail, stream->oa_buffer.tail);
>> > >> +	partial_report_size %= report_size;
>> > >> +
>> > >> +	/* Subtract partial amount off the tail */
>> > >> +	hw_tail = gtt_offset + ((hw_tail - partial_report_size) &
>> > >> +				(stream->oa_buffer.vma->size - 1));
>> > >
>> > > Couple of questions here because OA_TAKEN uses OA_BUFFER_SIZE and the above
>> > > expression uses stream->oa_buffer.vma->size:
>> > >
>> > > 1. Is 'OA_BUFFER_SIZE == stream->oa_buffer.vma->size'? We seem to be using
>> > >   the two interchangeably in the code.
>> >
>> > Yes. I think the code was updated to use vma->size when support for
>> > selecting OA buffer size along with large OA buffers was added, but we
>> > haven't pushed that upstream yet. Since I have cherry-picked patches here,
>> > there is some inconsistency. I would just change this patch to use
>> > OA_BUFFER_SIZE for now.
>> >
>> > > 2. If yes, can we add an assert about this in alloc_oa_buffer?
>> >
>> > If I change to OA_BUFFER_SIZE, then okay to skip assert?
>>
>> Yes.
>>
>> > Do you suspect that the vma size may actually differ from what we
>> > requested?
>>
>> Not sure how shmem objects are allocated but my guess would be that for a
>> nice whole size like 16 M the vma size will be the same. So ok to just
>> use OA_BUFFER_SIZE in a couple of places in this patch and skip the
>> assert. As long as vma_size >= OA_BUFFER_SIZE we are ok.
>>
>> >
>> > > 3. Can the above expression be changed to:
>> > >
>> > >	hw_tail = gtt_offset + OA_TAKEN(hw_tail, partial_report_size);
>> >
>> > Not if hw_tail has rolled over, but stream->oa_buffer.tail hasn't.
>>
>> Why not, the two expressions are exactly the same? And anyway
>> stream->oa_buffer.tail is already handled in the first OA_TAKEN expression.
>
>Basically, for me OA_TAKEN is a "circular diff" (for a power-of-2 sized
>circular buffer), so anywhere we have these "circular diff" operations we
>should be able to replace them by OA_TAKEN.

I guess I misread your comment. They are indeed identical. I can change 
that.
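
For reference, a minimal sketch of the tail adjustment using OA_TAKEN for
both steps. It assumes offsets relative to the start of the buffer and a
16M power-of-2 buffer. align_tail_to_report() is a hypothetical helper name,
and the 192-byte report size used in the note below is only an illustrative
non-power-of-2 multiple of 64.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed buffer size: 16M, a power of 2, so masking wraps correctly. */
#define OA_BUFFER_SIZE (16u * 1024 * 1024)

/* Circular difference between two offsets in the OA buffer. */
#define OA_TAKEN(tail, head) (((tail) - (head)) & (OA_BUFFER_SIZE - 1))

/*
 * Hypothetical helper: pull hw_tail back to the last whole report
 * boundary, measured from the last verified tail (sw_tail).
 */
static uint32_t align_tail_to_report(uint32_t hw_tail, uint32_t sw_tail,
				     uint32_t report_size)
{
	uint32_t partial;

	assert(report_size != 0);

	/* Bytes landed since sw_tail that do not make up a full report. */
	partial = OA_TAKEN(hw_tail, sw_tail) % report_size;

	/* Subtract the partial amount, wrapping around the buffer. */
	return OA_TAKEN(hw_tail, partial);
}
```

With 192-byte reports, a hardware tail at offset 448 and a verified tail at
0 give a partial amount of 64, so the adjusted tail is 384. The same
expression handles the case where the hardware tail has wrapped past the
end of the buffer and the verified tail has not.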

Thanks,
Umesh


* Re: [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf
  2023-02-21 23:53   ` Dixit, Ashutosh
  2023-02-22  0:10     ` Dixit, Ashutosh
@ 2023-02-24 19:37     ` Umesh Nerlige Ramappa
  2023-02-24 20:48       ` Dixit, Ashutosh
  1 sibling, 1 reply; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-24 19:37 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

On Tue, Feb 21, 2023 at 03:53:57PM -0800, Dixit, Ashutosh wrote:
>On Thu, 16 Feb 2023 16:58:49 -0800, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh,
>
>Patch is mostly ok but a few questions below:
>
>> Current implementation of perf defaults to render and configures the
>> default OAG unit. Since there are more OA units on newer hardware, allow
>> user to pass engine class and instance to program specific OA units.
>
>I think we should more clearly say here that the OA unit is identified by
>means of one of the engines (class/instance of that engine) associated with
>that OA unit. The engine -> OA unit mapping is a static mapping depending
>on the particular platform. Something like that.
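
As a rough illustration of how the OA unit is selected through an engine,
the sketch below parses (id, value) property pairs and falls back to
render:0 when no class:instance is given, as the patch does. The property
ids here are hypothetical placeholders, not the real
drm_i915_perf_property_id values. The engine class numbers are assumed to
follow the uAPI engine class enum, and parse_engine_props() is an invented
helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property ids; id 0 is reserved, PROP_MAX is not a property. */
enum {
	PROP_SAMPLE_OA = 1,
	PROP_OA_ENGINE_CLASS,
	PROP_OA_ENGINE_INSTANCE,
	PROP_MAX,
};

/* Assumed to mirror the uAPI engine classes. */
enum {
	ENGINE_CLASS_RENDER = 0,
	ENGINE_CLASS_VIDEO = 2,
};

struct engine_sel {
	uint8_t engine_class;
	uint8_t instance;
};

/* props holds n_props pairs laid out as id, value, id, value, ... */
static int parse_engine_props(const uint64_t *props, size_t n_props,
			      struct engine_sel *out)
{
	size_t i;

	assert(props && out);

	if (!n_props || n_props >= PROP_MAX)
		return -EINVAL;

	/* Defaults when class:instance is not passed. */
	out->engine_class = ENGINE_CLASS_RENDER;
	out->instance = 0;

	for (i = 0; i < n_props; i++) {
		uint64_t id = props[2 * i];
		uint64_t value = props[2 * i + 1];

		switch (id) {
		case PROP_SAMPLE_OA:
			break;	/* unrelated to engine selection */
		case PROP_OA_ENGINE_CLASS:
			out->engine_class = (uint8_t)value;
			break;
		case PROP_OA_ENGINE_INSTANCE:
			out->instance = (uint8_t)value;
			break;
		default:
			return -EINVAL;
		}
	}

	return 0;
}
```

The resulting class:instance pair would then be looked up to find the
engine, and the engine's static group mapping determines which OA unit the
stream opens.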
>
>>
>> UMD specific changes for GPUvis support:
>> https://patchwork.freedesktop.org/patch/522827/?series=114023
>> https://patchwork.freedesktop.org/patch/522822/?series=114023
>> https://patchwork.freedesktop.org/patch/522826/?series=114023
>> https://patchwork.freedesktop.org/patch/522828/?series=114023
>> https://patchwork.freedesktop.org/patch/522816/?series=114023
>> https://patchwork.freedesktop.org/patch/522825/?series=114023
>>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_perf.c | 49 +++++++++++++++++++-------------
>>  include/uapi/drm/i915_drm.h      | 20 +++++++++++++
>>  2 files changed, 49 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index d3a1892c93be..f028df812067 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -4035,40 +4035,29 @@ static int read_properties_unlocked(struct i915_perf *perf,
>>	struct drm_i915_gem_context_param_sseu user_sseu;
>>	u64 __user *uprop = uprops;
>>	bool config_sseu = false;
>> +	u8 class, instance;
>>	u32 i;
>>	int ret;
>>
>>	memset(props, 0, sizeof(struct perf_open_properties));
>>	props->poll_oa_period = DEFAULT_POLL_PERIOD_NS;
>>
>> -	if (!n_props) {
>> -		drm_dbg(&perf->i915->drm,
>> -			"No i915 perf properties given\n");
>> -		return -EINVAL;
>> -	}
>> -
>> -	/* At the moment we only support using i915-perf on the RCS. */
>> -	props->engine = intel_engine_lookup_user(perf->i915,
>> -						 I915_ENGINE_CLASS_RENDER,
>> -						 0);
>> -	if (!props->engine) {
>> -		drm_dbg(&perf->i915->drm,
>> -			"No RENDER-capable engines\n");
>> -		return -EINVAL;
>> -	}
>> -
>>	/* Considering that ID = 0 is reserved and assuming that we don't
>>	 * (currently) expect any configurations to ever specify duplicate
>>	 * values for a particular property ID then the last _PROP_MAX value is
>>	 * one greater than the maximum number of properties we expect to get
>>	 * from userspace.
>>	 */
>> -	if (n_props >= DRM_I915_PERF_PROP_MAX) {
>> +	if (!n_props || n_props >= DRM_I915_PERF_PROP_MAX) {
>>		drm_dbg(&perf->i915->drm,
>> -			"More i915 perf properties specified than exist\n");
>> +			"Invalid no. of i915 perf properties given\n");
>
>Invalid number
>
>>		return -EINVAL;
>>	}
>>
>> +	/* Defaults when class:instance is not passed */
>> +	class = I915_ENGINE_CLASS_RENDER;
>> +	instance = 0;
>> +
>>	for (i = 0; i < n_props; i++) {
>>		u64 oa_period, oa_freq_hz;
>>		u64 id, value;
>> @@ -4189,7 +4178,13 @@ static int read_properties_unlocked(struct i915_perf *perf,
>>			}
>>			props->poll_oa_period = value;
>>			break;
>> -		case DRM_I915_PERF_PROP_MAX:
>> +		case DRM_I915_PERF_PROP_OA_ENGINE_CLASS:
>> +			class = (u8)value;
>> +			break;
>> +		case DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE:
>> +			instance = (u8)value;
>> +			break;
>> +		default:
>>			MISSING_CASE(id);
>>			return -EINVAL;
>>		}
>> @@ -4197,6 +4192,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
>>		uprop += 2;
>>	}
>>
>> +	props->engine = intel_engine_lookup_user(perf->i915, class, instance);
>> +	if (!props->engine) {
>> +		drm_dbg(&perf->i915->drm,
>> +			"OA engine class and instance invalid %d:%d\n",
>> +			class, instance);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (!engine_supports_oa(props->engine))
>> +		return -EINVAL;
>
>Need drm_dbg here too?
>
>> +
>>	if (config_sseu) {
>>		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
>>		if (ret) {
>> @@ -5208,8 +5214,11 @@ int i915_perf_ioctl_version(void)
>>	 *
>>	 * 5: Add DRM_I915_PERF_PROP_POLL_OA_PERIOD parameter that controls the
>>	 *    interval for the hrtimer used to check for OA data.
>> +	 *
>> +	 * 6: Add DRM_I915_PERF_PROP_OA_ENGINE_CLASS and
>> +	 *    DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE
>>	 */
>> -	return 5;
>> +	return 6;
>
>Do we need a separate revision for this? Maybe club it with OAM support
>since that is where this is getting introduced?

It's a separate revision to identify support for multiple GTs, even 
without OAM.
>
>>  }
>>
>>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 8df261c5ab9b..b6922b52d85c 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -2758,6 +2758,26 @@ enum drm_i915_perf_property_id {
>>	 */
>>	DRM_I915_PERF_PROP_POLL_OA_PERIOD,
>>
>> +	/**
>> +	 * In platforms with multiple OA buffers, the engine class instance must
>> +	 * be passed to open a stream to a OA unit corresponding to the engine.
>> +	 * Multiple engines may be mapped to the same OA unit.
>> +	 *
>> +	 * In addition to the class:instance, if a gem context is also passed, then
>> +	 * 1) the report headers of OA reports from other engines are squashed.
>
>Other engines or you mean other contexts?
>
>> +	 * 2) OAR is enabled for the class:instance
>
>For render engine?
>
>Maybe the above comments need to be more crisp since they are in i915_drm.h
>or is it only I who is confused :)
>
>> +	 *
>> +	 * This property is available in perf revision 6.
>> +	 */
>> +	DRM_I915_PERF_PROP_OA_ENGINE_CLASS,
>> +
>> +	/**
>> +	 * This parameter specifies the engine instance.
>> +	 *
>> +	 * This property is available in perf revision 6.
>> +	 */
>> +	DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE,
>> +
>>	DRM_I915_PERF_PROP_MAX /* non-ABI */
>>  };
>>
>> --
>> 2.36.1
>>
>
>Thanks.
>--
>Ashutosh


* Re: [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf
  2023-02-24 19:37     ` Umesh Nerlige Ramappa
@ 2023-02-24 20:48       ` Dixit, Ashutosh
  0 siblings, 0 replies; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-24 20:48 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Fri, 24 Feb 2023 11:37:01 -0800, Umesh Nerlige Ramappa wrote:
>
> On Tue, Feb 21, 2023 at 03:53:57PM -0800, Dixit, Ashutosh wrote:
> > On Thu, 16 Feb 2023 16:58:49 -0800, Umesh Nerlige Ramappa wrote:
> >>
> >
> > Hi Umesh,
> >
> > Patch is mostly ok but a few questions below:
> >
> >> Current implementation of perf defaults to render and configures the
> >> default OAG unit. Since there are more OA units on newer hardware, allow
> >> user to pass engine class and instance to program specific OA units.
> >
> > I think we should more clearly say here that the OA unit is identified by
> > means of one of the engines (class/instance of that engine) associated with
> > that OA unit. The engine -> OA unit mapping is a static mapping depending
> > on the particular platform. Something like that.
> >
> >>
> >> UMD specific changes for GPUvis support:
> >> https://patchwork.freedesktop.org/patch/522827/?series=114023
> >> https://patchwork.freedesktop.org/patch/522822/?series=114023
> >> https://patchwork.freedesktop.org/patch/522826/?series=114023
> >> https://patchwork.freedesktop.org/patch/522828/?series=114023
> >> https://patchwork.freedesktop.org/patch/522816/?series=114023
> >> https://patchwork.freedesktop.org/patch/522825/?series=114023
> >>
> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_perf.c | 49 +++++++++++++++++++-------------
> >>  include/uapi/drm/i915_drm.h      | 20 +++++++++++++
> >>  2 files changed, 49 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> >> index d3a1892c93be..f028df812067 100644
> >> --- a/drivers/gpu/drm/i915/i915_perf.c
> >> +++ b/drivers/gpu/drm/i915/i915_perf.c
> >> @@ -4035,40 +4035,29 @@ static int read_properties_unlocked(struct i915_perf *perf,
> >>	struct drm_i915_gem_context_param_sseu user_sseu;
> >>	u64 __user *uprop = uprops;
> >>	bool config_sseu = false;
> >> +	u8 class, instance;
> >>	u32 i;
> >>	int ret;
> >>
> >>	memset(props, 0, sizeof(struct perf_open_properties));
> >>	props->poll_oa_period = DEFAULT_POLL_PERIOD_NS;
> >>
> >> -	if (!n_props) {
> >> -		drm_dbg(&perf->i915->drm,
> >> -			"No i915 perf properties given\n");
> >> -		return -EINVAL;
> >> -	}
> >> -
> >> -	/* At the moment we only support using i915-perf on the RCS. */
> >> -	props->engine = intel_engine_lookup_user(perf->i915,
> >> -						 I915_ENGINE_CLASS_RENDER,
> >> -						 0);
> >> -	if (!props->engine) {
> >> -		drm_dbg(&perf->i915->drm,
> >> -			"No RENDER-capable engines\n");
> >> -		return -EINVAL;
> >> -	}
> >> -
> >>	/* Considering that ID = 0 is reserved and assuming that we don't
> >>	 * (currently) expect any configurations to ever specify duplicate
> >>	 * values for a particular property ID then the last _PROP_MAX value is
> >>	 * one greater than the maximum number of properties we expect to get
> >>	 * from userspace.
> >>	 */
> >> -	if (n_props >= DRM_I915_PERF_PROP_MAX) {
> >> +	if (!n_props || n_props >= DRM_I915_PERF_PROP_MAX) {
> >>		drm_dbg(&perf->i915->drm,
> >> -			"More i915 perf properties specified than exist\n");
> >> +			"Invalid no. of i915 perf properties given\n");
> >
> > Invalid number
> >
> >>		return -EINVAL;
> >>	}
> >>
> >> +	/* Defaults when class:instance is not passed */
> >> +	class = I915_ENGINE_CLASS_RENDER;
> >> +	instance = 0;
> >> +
> >>	for (i = 0; i < n_props; i++) {
> >>		u64 oa_period, oa_freq_hz;
> >>		u64 id, value;
> >> @@ -4189,7 +4178,13 @@ static int read_properties_unlocked(struct i915_perf *perf,
> >>			}
> >>			props->poll_oa_period = value;
> >>			break;
> >> -		case DRM_I915_PERF_PROP_MAX:
> >> +		case DRM_I915_PERF_PROP_OA_ENGINE_CLASS:
> >> +			class = (u8)value;
> >> +			break;
> >> +		case DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE:
> >> +			instance = (u8)value;
> >> +			break;
> >> +		default:
> >>			MISSING_CASE(id);
> >>			return -EINVAL;
> >>		}
> >> @@ -4197,6 +4192,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
> >>		uprop += 2;
> >>	}
> >>
> >> +	props->engine = intel_engine_lookup_user(perf->i915, class, instance);
> >> +	if (!props->engine) {
> >> +		drm_dbg(&perf->i915->drm,
> >> +			"OA engine class and instance invalid %d:%d\n",
> >> +			class, instance);
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	if (!engine_supports_oa(props->engine))
> >> +		return -EINVAL;
> >
> > Need drm_dbg here too?
> >
> >> +
> >>	if (config_sseu) {
> >>		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
> >>		if (ret) {
> >> @@ -5208,8 +5214,11 @@ int i915_perf_ioctl_version(void)
> >>	 *
> >>	 * 5: Add DRM_I915_PERF_PROP_POLL_OA_PERIOD parameter that controls the
> >>	 *    interval for the hrtimer used to check for OA data.
> >> +	 *
> >> +	 * 6: Add DRM_I915_PERF_PROP_OA_ENGINE_CLASS and
> >> +	 *    DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE
> >>	 */
> >> -	return 5;
> >> +	return 6;
> >
> > Do we need a separate revision for this? Maybe club it with OAM support
> > since that is where this is getting introduced?
>
> It's a separate revision to identify support for multiple GTs, even without
> OAM.

I was thinking that, first, there are no such products (xehpsdv was one, but
is now dead I believe), and even if there were, the series will be merged
into a single kernel version, so a single version would suffice.

Maybe you mean that each patch which touches the uapi should up the OA
version?

In any case, since it is just a version number, no issues, either way is ok
with me.
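
For illustration only, a userspace consumer could gate the new properties on
the reported perf revision roughly as below. This is a minimal sketch, not
code from the series or from IGT: the PROP_* values are local stand-ins for
the uapi enum in i915_drm.h, and build_props() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the uapi property ids; real code would use i915_drm.h. */
enum {
	PROP_SAMPLE_OA = 1,
	PROP_OA_ENGINE_CLASS,
	PROP_OA_ENGINE_INSTANCE,
};

/* Perf revision that adds the class:instance properties (from the patch). */
#define PERF_REVISION_ENGINE_SELECT 6

/*
 * Fill @props with (id, value) pairs and return the number of pairs written.
 * The class:instance pair is appended only when the kernel reports a perf
 * revision that understands it; older kernels keep the render default.
 */
static size_t build_props(uint64_t *props, size_t max_pairs, int revision,
			  uint8_t engine_class, uint8_t engine_instance)
{
	size_t n = 0;

	assert(max_pairs >= 3);

	props[2 * n] = PROP_SAMPLE_OA;
	props[2 * n + 1] = 1;
	n++;

	if (revision >= PERF_REVISION_ENGINE_SELECT) {
		props[2 * n] = PROP_OA_ENGINE_CLASS;
		props[2 * n + 1] = engine_class;
		n++;
		props[2 * n] = PROP_OA_ENGINE_INSTANCE;
		props[2 * n + 1] = engine_instance;
		n++;
	}

	return n;
}
```

With a single revision bump for the whole series, the same check would still
work; only the meaning attached to the number changes.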

Thanks.
--
Ashutosh

> >
> >>  }
> >>
> >>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> >> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> >> index 8df261c5ab9b..b6922b52d85c 100644
> >> --- a/include/uapi/drm/i915_drm.h
> >> +++ b/include/uapi/drm/i915_drm.h
> >> @@ -2758,6 +2758,26 @@ enum drm_i915_perf_property_id {
> >>	 */
> >>	DRM_I915_PERF_PROP_POLL_OA_PERIOD,
> >>
> >> +	/**
> >> +	 * In platforms with multiple OA buffers, the engine class instance must
> >> +	 * be passed to open a stream to a OA unit corresponding to the engine.
> >> +	 * Multiple engines may be mapped to the same OA unit.
> >> +	 *
> >> +	 * In addition to the class:instance, if a gem context is also passed, then
> >> +	 * 1) the report headers of OA reports from other engines are squashed.
> >
> > Other engines or you mean other contexts?
> >
> >> +	 * 2) OAR is enabled for the class:instance
> >
> > For render engine?
> >
> > Maybe the above comments need to be more crisp since they are in i915_drm.h
> > or is it only I who is confused :)
> >
> >> +	 *
> >> +	 * This property is available in perf revision 6.
> >> +	 */
> >> +	DRM_I915_PERF_PROP_OA_ENGINE_CLASS,
> >> +
> >> +	/**
> >> +	 * This parameter specifies the engine instance.
> >> +	 *
> >> +	 * This property is available in perf revision 6.
> >> +	 */
> >> +	DRM_I915_PERF_PROP_OA_ENGINE_INSTANCE,
> >> +
> >>	DRM_I915_PERF_PROP_MAX /* non-ABI */
> >>  };
> >>
> >> --
> >> 2.36.1
> >>
> >
> > Thanks.
> > --
> > Ashutosh


* Re: [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units
  2023-02-23 20:05   ` Dixit, Ashutosh
@ 2023-02-25  0:58     ` Umesh Nerlige Ramappa
  2023-02-25  3:58       ` Dixit, Ashutosh
  0 siblings, 1 reply; 33+ messages in thread
From: Umesh Nerlige Ramappa @ 2023-02-25  0:58 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

On Thu, Feb 23, 2023 at 12:05:02PM -0800, Dixit, Ashutosh wrote:
>On Thu, 16 Feb 2023 16:58:50 -0800, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh,
>
>> MTL introduces additional OA units dedicated to media use cases. Add
>> support for programming these OA units by passing the media engine class
>> and instance parameters.
>>
>> UMD specific changes for GPUvis support:
>> https://patchwork.freedesktop.org/patch/522827/?series=114023
>> https://patchwork.freedesktop.org/patch/522822/?series=114023
>> https://patchwork.freedesktop.org/patch/522826/?series=114023
>> https://patchwork.freedesktop.org/patch/522828/?series=114023
>> https://patchwork.freedesktop.org/patch/522816/?series=114023
>> https://patchwork.freedesktop.org/patch/522825/?series=114023
>
>General comment about the patch in case I miss something out, as I've
>mentioned previously in general let's try to replace INTEL_METEORLAKE and
>IS_METEORLAKE checks in the patch with:
>
>	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
>
>So that we don't have to enumerate each platform individually later.

Hmm, I recall that you had already commented about this at some point, 
sorry I missed that. Do you suggest I add this change in places outside 
this patch as well?

>
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h          |   2 +
>>  drivers/gpu/drm/i915/i915_pci.c          |   1 +
>>  drivers/gpu/drm/i915/i915_perf.c         | 247 ++++++++++++++++++++---
>>  drivers/gpu/drm/i915/i915_perf_oa_regs.h |  78 +++++++
>>  drivers/gpu/drm/i915/i915_perf_types.h   |  40 ++++
>>  drivers/gpu/drm/i915/intel_device_info.h |   1 +
>>  include/uapi/drm/i915_drm.h              |   4 +
>>  7 files changed, 347 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 0393273faa09..f3cacbf41c86 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -856,6 +856,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>>	(INTEL_INFO(dev_priv)->has_oa_bpc_reporting)
>>  #define HAS_OA_SLICE_CONTRIB_LIMITS(dev_priv) \
>>	(INTEL_INFO(dev_priv)->has_oa_slice_contrib_limits)
>> +#define HAS_OAM(dev_priv) \
>> +	(INTEL_INFO(dev_priv)->has_oam)
>>
>>  /*
>>   * Set this flag, when platform requires 64K GTT page sizes or larger for
>> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>> index a8d942b16223..621730b6551c 100644
>> --- a/drivers/gpu/drm/i915/i915_pci.c
>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>> @@ -1028,6 +1028,7 @@ static const struct intel_device_info adl_p_info = {
>>	.has_mslice_steering = 1, \
>>	.has_oa_bpc_reporting = 1, \
>>	.has_oa_slice_contrib_limits = 1, \
>> +	.has_oam = 1, \
>>	.has_rc6 = 1, \
>>	.has_reset_engine = 1, \
>>	.has_rps = 1, \
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index f028df812067..a57690f4c531 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -192,6 +192,7 @@
>>   */
>>
>>  #include <linux/anon_inodes.h>
>> +#include <linux/nospec.h>
>>  #include <linux/sizes.h>
>>  #include <linux/uuid.h>
>>
>> @@ -326,6 +327,13 @@ static const struct i915_oa_format oa_formats[I915_OA_FORMAT_MAX] = {
>>	[I915_OA_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 },
>>	[I915_OAR_FORMAT_A32u40_A4u32_B8_C8]    = { 5, 256 },
>>	[I915_OA_FORMAT_A24u40_A14u32_B8_C8]    = { 5, 256 },
>> +	[I915_OAM_FORMAT_MPEC8u64_B8_C8]	= { 1, 192, TYPE_OAM, HDR_64_BIT },
>> +	[I915_OAM_FORMAT_MPEC8u32_B8_C8]	= { 2, 128, TYPE_OAM, HDR_64_BIT },
>> +};
>> +
>> +/* PERF_GROUP_OAG is unused for oa_base, drop it for mtl */
>
>What does this comment mean?

There are multiple OAM units and the base for each is used to calculate 
the OA regs mmio address. OAG is just one unit with the same addresses 
for the regs, so we don't use this array that initializes the bases for 
OA units. Maybe I will add this in the comment here.
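
To make the base-relative addressing concrete, here is a small standalone
sketch. The base value 0x393000 is the one from the patch; the two offsets
are illustrative assumptions only (the real ones are defined in
i915_perf_oa_regs.h), and struct oa_regs is a cut-down, hypothetical stand-in
for struct i915_perf_regs.

```c
#include <assert.h>
#include <stdint.h>

/* From the patch: base of the single OAM unit on the MTL media GT. */
#define MTL_OAM_SAMEDIA_0_BASE	0x393000u

/* Assumed offsets, for illustration only; see i915_perf_oa_regs.h. */
#define OAM_HEAD_PTR_OFFSET	0x1a0u
#define OAM_TAIL_PTR_OFFSET	0x1a4u

/* Cut-down stand-in for struct i915_perf_regs. */
struct oa_regs {
	uint32_t base;
	uint32_t head_ptr;
	uint32_t tail_ptr;
};

/* Every OAM register address is derived from the per-unit base. */
static struct oa_regs oam_regs(uint32_t base)
{
	/* OAG has fixed register addresses and never goes through a base. */
	assert(base != 0);

	return (struct oa_regs) {
		.base = base,
		.head_ptr = base + OAM_HEAD_PTR_OFFSET,
		.tail_ptr = base + OAM_TAIL_PTR_OFFSET,
	};
}
```

A second OAM unit would only need another entry in the base table; the
register layout relative to the base stays the same.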

>
>> +static const u32 mtl_oa_base[] = {
>> +	[PERF_GROUP_OAM_SAMEDIA_0] = 0x393000,
>>  };
>>
>>  #define SAMPLE_OA_REPORT      (1<<0)
>> @@ -418,11 +426,17 @@ static void free_oa_config_bo(struct i915_oa_config_bo *oa_bo)
>>	kfree(oa_bo);
>>  }
>>
>> +static inline const
>> +struct i915_perf_regs *__oa_regs(struct i915_perf_stream *stream)
>> +{
>> +	return &stream->oa_buffer.group->regs;
>
>Should just use stream->engine->oa_group->regs, see near the bottom.
>
>> +}
>> +
>>  static u32 gen12_oa_hw_tail_read(struct i915_perf_stream *stream)
>>  {
>>	struct intel_uncore *uncore = stream->uncore;
>>
>> -	return intel_uncore_read(uncore, GEN12_OAG_OATAILPTR) &
>> +	return intel_uncore_read(uncore, __oa_regs(stream)->oa_tail_ptr) &
>>	       GEN12_OAG_OATAILPTR_MASK;
>>  }
>>
>> @@ -886,7 +900,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
>>		i915_reg_t oaheadptr;
>>
>>		oaheadptr = GRAPHICS_VER(stream->perf->i915) == 12 ?
>
>>= 12 ?
>
>> -			    GEN12_OAG_OAHEADPTR : GEN8_OAHEADPTR;
>> +			    __oa_regs(stream)->oa_head_ptr :
>> +			    GEN8_OAHEADPTR;
>>
>>		spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>>
>> @@ -939,7 +954,8 @@ static int gen8_oa_read(struct i915_perf_stream *stream,
>>		return -EIO;
>>
>>	oastatus_reg = GRAPHICS_VER(stream->perf->i915) == 12 ?
>
>>= 12 ?
>
>> -		       GEN12_OAG_OASTATUS : GEN8_OASTATUS;
>> +		       __oa_regs(stream)->oa_status :
>> +		       GEN8_OASTATUS;
>>
>>	oastatus = intel_uncore_read(uncore, oastatus_reg);
>>
>> @@ -1643,16 +1659,46 @@ free_noa_wait(struct i915_perf_stream *stream)
>>	i915_vma_unpin_and_release(&stream->noa_wait, 0);
>>  }
>>
>> +/*
>> + * intel_engine_lookup_user ensures that most of engine specific checks are
>> + * taken care of, however, we can run into a case where the OA unit catering to
>> + * the engine passed by the user is disabled for some reason. In such cases,
>> + * ensure oa unit corresponding to an engine is functional. If there are no
>> + * engines in the group, the unit is disabled.
>> + */
>> +static bool oa_unit_functional(const struct intel_engine_cs *engine)
>> +{
>> +	return engine->oa_group && engine->oa_group->num_engines;
>> +}
>> +
>>  static bool engine_supports_oa(const struct intel_engine_cs *engine)
>>  {
>>	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
>>
>>	switch (platform) {
>> +	case INTEL_METEORLAKE:
>> +		return engine->class == RENDER_CLASS ||
>> +		       ((engine->class == VIDEO_DECODE_CLASS ||
>> +			 engine->class == VIDEO_ENHANCEMENT_CLASS) &&
>> +			engine->gt->type == GT_MEDIA);
>>	default:
>>		return engine->class == RENDER_CLASS;
>>	}
>
>As mentioned in a previous patch, this could just be:
>
>	return engine->oa_group;
>
>Because all these checks have already been done when the perf groups were
>initialized so let's use that, as is done for oa_unit_functional.
>
>Though, caution, to return engine->oa_group we'd have to remove
>engine_supports_oa from __oa_engine_group, since engine->oa_group is not
>yet assigned there. But I think the engine_supports_oa check in
>__oa_engine_group is a duplication and should be removed.
>
>>  }
>>
>> +static bool engine_class_supports_oa_format(struct intel_engine_cs *engine, int type)
>> +{
>> +	switch (engine->class) {
>> +	case RENDER_CLASS:
>> +		return type == TYPE_OAG;
>> +	case VIDEO_DECODE_CLASS:
>> +	case VIDEO_ENHANCEMENT_CLASS:
>> +		return type == TYPE_OAM;
>> +	default:
>> +		return false;
>> +	}
>> +}
>> +
>
>Again, how about:
>
>	return engine->oa_group && engine->oa_group->type == type;
>
>Otherwise as mentioned below oa_group->type is unused and also incorrectly
>assigned at present. The format type and group types are the same
>(TYPE_OAG/TYPE_OAM). Can name the function engine_supports_oa_format.
>
>>  static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
>>  {
>>	struct i915_perf *perf = stream->perf;
>> @@ -1680,7 +1726,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
>>		drm_WARN_ON(&gt->i915->drm,
>>			    intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc));
>>
>> -	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
>> +	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
>>	intel_engine_pm_put(stream->engine);
>>
>>	if (stream->ctx)
>> @@ -1804,8 +1850,8 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
>>
>>	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>>
>> -	intel_uncore_write(uncore, GEN12_OAG_OASTATUS, 0);
>> -	intel_uncore_write(uncore, GEN12_OAG_OAHEADPTR,
>> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_status, 0);
>> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_head_ptr,
>>			   gtt_offset & GEN12_OAG_OAHEADPTR_MASK);
>>	stream->oa_buffer.head = gtt_offset;
>>
>> @@ -1817,9 +1863,9 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
>>	 *  to enable proper functionality of the overflow
>>	 *  bit."
>>	 */
>> -	intel_uncore_write(uncore, GEN12_OAG_OABUFFER, gtt_offset |
>> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_buffer, gtt_offset |
>>			   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT);
>> -	intel_uncore_write(uncore, GEN12_OAG_OATAILPTR,
>> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_tail_ptr,
>>			   gtt_offset & GEN12_OAG_OATAILPTR_MASK);
>>
>>	/* Mark that we need updated tail pointers to read from... */
>> @@ -2579,7 +2625,8 @@ gen8_modify_self(struct intel_context *ce,
>>	return err;
>>  }
>>
>> -static int gen8_configure_context(struct i915_gem_context *ctx,
>> +static int gen8_configure_context(struct i915_perf_stream *stream,
>> +				  struct i915_gem_context *ctx,
>>				  struct flex *flex, unsigned int count)
>>  {
>>	struct i915_gem_engines_iter it;
>> @@ -2589,7 +2636,8 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
>>	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
>>		GEM_BUG_ON(ce == ce->engine->kernel_context);
>>
>> -		if (!engine_supports_oa(ce->engine))
>> +		if (!engine_supports_oa(ce->engine) ||
>> +		    ce->engine->class != stream->engine->class)
>>			continue;
>>
>>		/* Otherwise OA settings will be set upon first use */
>> @@ -2720,7 +2768,7 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
>>
>>		spin_unlock(&i915->gem.contexts.lock);
>>
>> -		err = gen8_configure_context(ctx, regs, num_regs);
>> +		err = gen8_configure_context(stream, ctx, regs, num_regs);
>>		if (err) {
>>			i915_gem_context_put(ctx);
>>			return err;
>> @@ -2740,7 +2788,8 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
>>	for_each_uabi_engine(engine, i915) {
>>		struct intel_context *ce = engine->kernel_context;
>>
>> -		if (!engine_supports_oa(ce->engine))
>> +		if (!engine_supports_oa(ce->engine) ||
>> +		    ce->engine->class != stream->engine->class)
>>			continue;
>>
>>		regs[0].value = intel_sseu_make_rpcs(engine->gt, &ce->sseu);
>> @@ -2765,6 +2814,9 @@ gen12_configure_all_contexts(struct i915_perf_stream *stream,
>>		},
>>	};
>>
>> +	if (stream->engine->class != RENDER_CLASS)
>> +		return 0;
>
>OK, this is for render, nothing equivalent needed for media?

The media engines do not have anything configured in the CS contexts;
everything is saved/restored on power context transitions, so there is
nothing to be done here.

>
>> +
>>	return oa_configure_all_contexts(stream,
>>					 regs, ARRAY_SIZE(regs),
>>					 active);
>> @@ -2894,7 +2946,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
>>				   _MASKED_BIT_ENABLE(GEN12_DISABLE_DOP_GATING));
>>	}
>>
>> -	intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG,
>> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_debug,
>>			   /* Disable clk ratio reports, like previous Gens. */
>>			   _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS |
>>					      GEN12_OAG_OA_DEBUG_INCLUDE_CLK_RATIO) |
>> @@ -2904,7 +2956,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
>>			    */
>>			   oag_report_ctx_switches(stream));
>>
>> -	intel_uncore_write(uncore, GEN12_OAG_OAGLBCTXCTRL, periodic ?
>> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctx_ctrl, periodic ?
>>			   (GEN12_OAG_OAGLBCTXCTRL_COUNTER_RESUME |
>>			    GEN12_OAG_OAGLBCTXCTRL_TIMER_ENABLE |
>>			    (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT))
>> @@ -3058,8 +3110,8 @@ static void gen8_oa_enable(struct i915_perf_stream *stream)
>>
>>  static void gen12_oa_enable(struct i915_perf_stream *stream)
>>  {
>> -	struct intel_uncore *uncore = stream->uncore;
>> -	u32 report_format = stream->oa_buffer.format->format;
>> +	const struct i915_perf_regs *regs;
>> +	u32 val;
>>
>>	/*
>>	 * If we don't want OA reports from the OA buffer, then we don't even
>> @@ -3070,9 +3122,11 @@ static void gen12_oa_enable(struct i915_perf_stream *stream)
>>
>>	gen12_init_oa_buffer(stream);
>>
>> -	intel_uncore_write(uncore, GEN12_OAG_OACONTROL,
>> -			   (report_format << GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT) |
>> -			   GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE);
>> +	regs = __oa_regs(stream);
>> +	val = (stream->oa_buffer.format->format << regs->oa_ctrl_counter_format_shift) |
>> +	      GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE;
>> +
>> +	intel_uncore_write(stream->uncore, regs->oa_ctrl, val);
>>  }
>>
>>  /**
>> @@ -3124,9 +3178,9 @@ static void gen12_oa_disable(struct i915_perf_stream *stream)
>>  {
>>	struct intel_uncore *uncore = stream->uncore;
>>
>> -	intel_uncore_write(uncore, GEN12_OAG_OACONTROL, 0);
>> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctrl, 0);
>>	if (intel_wait_for_register(uncore,
>> -				    GEN12_OAG_OACONTROL,
>> +				    __oa_regs(stream)->oa_ctrl,
>>				    GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE, 0,
>>				    50))
>>		drm_err(&stream->perf->i915->drm,
>> @@ -3329,6 +3383,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>>
>>	stream->sample_size = sizeof(struct drm_i915_perf_record_header);
>>
>> +	stream->oa_buffer.group = g;
>
>Should just use stream->engine->oa_group, see near the bottom.
>
>>	stream->oa_buffer.format = &perf->oa_formats[props->oa_format];
>>	if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format->size == 0))
>>		return -EINVAL;
>> @@ -3379,7 +3434,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>>	 *   references will effectively disable RC6.
>>	 */
>>	intel_engine_pm_get(stream->engine);
>> -	intel_uncore_forcewake_get(stream->uncore, FORCEWAKE_ALL);
>> +	intel_uncore_forcewake_get(stream->uncore, g->fw_domains);
>>
>>	/*
>>	 * Wa_16011777198:dg2: GuC resets render as part of the Wa. This causes
>> @@ -3440,7 +3495,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>>		intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc);
>>
>>  err_gucrc:
>> -	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
>> +	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
>>	intel_engine_pm_put(stream->engine);
>>
>>	free_oa_configs(stream);
>> @@ -4033,6 +4088,7 @@ static int read_properties_unlocked(struct i915_perf *perf,
>>				    struct perf_open_properties *props)
>>  {
>>	struct drm_i915_gem_context_param_sseu user_sseu;
>> +	const struct i915_oa_format *f;
>>	u64 __user *uprop = uprops;
>>	bool config_sseu = false;
>>	u8 class, instance;
>> @@ -4203,6 +4259,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
>>	if (!engine_supports_oa(props->engine))
>>		return -EINVAL;
>>
>> +	if (!oa_unit_functional(props->engine))
>> +		return -ENODEV;
>> +
>> +	i = array_index_nospec(props->oa_format, I915_OA_FORMAT_MAX);
>
>Why do we need this (something to do with speculation)? Can just do
>'&perf->oa_formats[props->oa_format]' below? The format passed in has
>already been checked in the switch statement above.

Traced it back to "smatch cleanups" commit in rebase history. Something 
to do with static analysis. If not a major concern, I would leave it as 
is.
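
For reference, the clamping idea behind array_index_nospec() can be modelled
in userspace as below. This is a sketch of the generic mask trick as I
understand it, not the kernel implementation (which also hides the index from
the optimizer and has per-arch overrides). It assumes two's complement and an
arithmetic right shift of negative values, as on GCC and Clang.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG	(sizeof(long) * CHAR_BIT)

/*
 * Return ~0UL when index < size and 0 otherwise, without a branch.
 * If index is in range, both index and (size - 1 - index) have a clear top
 * bit, so the complement is negative and the shift smears the sign bit.
 */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
	assert(size > 0 && size <= LONG_MAX);

	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (BITS_PER_LONG - 1));
}

/* Clamp an out-of-range index to 0 so it can never address past the array. */
static unsigned long index_nospec(unsigned long index, unsigned long size)
{
	return index & index_mask_nospec(index, size);
}
```

Since the format has already been range-checked in the switch, the clamp is a
no-op architecturally; it only matters on a mispredicted path.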

>
>> +	f = &perf->oa_formats[i];
>> +	if (!engine_class_supports_oa_format(props->engine, f->type)) {
>> +		DRM_DEBUG("Invalid OA format %d for class %d\n",
>> +			  f->type, props->engine->class);
>> +		return -EINVAL;
>> +	}
>> +
>>	if (config_sseu) {
>>		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
>>		if (ret) {
>> @@ -4383,6 +4450,14 @@ static const struct i915_range gen12_oa_b_counters[] = {
>>	{}
>>  };
>>
>> +static const struct i915_range mtl_oam_b_counters[] = {
>> +	{ .start = 0x393000, .end = 0x39301c },	/* GEN12_OAM_STARTTRIG1[1-8] */
>> +	{ .start = 0x393020, .end = 0x39303c },	/* GEN12_OAM_REPORTTRIG1[1-8] */
>> +	{ .start = 0x393040, .end = 0x39307c },	/* GEN12_OAM_CEC[0-7][0-1] */
>> +	{ .start = 0x393200, .end = 0x39323C },	/* MPES[0-7] */
>> +	{}
>> +};
>> +
>>  static const struct i915_range xehp_oa_b_counters[] = {
>>	{ .start = 0xdc48, .end = 0xdc48 },	/* OAA_ENABLE_REG */
>>	{ .start = 0xdd00, .end = 0xdd48 },	/* OAG_LCE0_0 - OAA_LENABLE_REG */
>> @@ -4429,13 +4504,16 @@ static const struct i915_range gen12_oa_mux_regs[] = {
>>
>>  /*
>>   * Ref: 14010536224:
>> - * 0x20cc is repurposed on MTL, so use a separate array for MTL.
>> + * 0x20cc is repurposed on MTL, so use a separate array for MTL. Also add the
>> + * MPES/MPEC registers.
>
>MPES/MPEC registers are added above now, not here so maybe get rid of the
>comment change above?
>
>>   */
>>  static const struct i915_range mtl_oa_mux_regs[] = {
>>	{ .start = 0x0d00, .end = 0x0d04 },	/* RPM_CONFIG[0-1] */
>>	{ .start = 0x0d0c, .end = 0x0d2c },	/* NOA_CONFIG[0-8] */
>>	{ .start = 0x9840, .end = 0x9840 },	/* GDT_CHICKEN_BITS */
>>	{ .start = 0x9884, .end = 0x9888 },	/* NOA_WRITE */
>> +	{ .start = 0x38d100, .end = 0x38d114},	/* VISACTL */
>> +	{}
>>  };
>>
>>  static bool gen7_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
>> @@ -4473,10 +4551,26 @@ static bool gen12_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
>>	return reg_in_range_table(addr, gen12_oa_b_counters);
>>  }
>>
>> +static bool xehp_is_valid_oam_b_counter_addr(struct i915_perf *perf, u32 addr)
>> +{
>> +	enum intel_platform platform = INTEL_INFO(perf->i915)->platform;
>> +
>> +	if (!HAS_OAM(perf->i915))
>> +		return false;
>> +
>> +	switch (platform) {
>> +	case INTEL_METEORLAKE:
>> +		return reg_in_range_table(addr, mtl_oam_b_counters);
>> +	default:
>> +		return false;
>> +	}
>
>Replace with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))', registers
>are identical in later platforms too.
>
>Should the function prefix be xehp or mtl? Don't see xehp in bspec,
>probably xehp is discontinued.
>
>> +}
>> +
>>  static bool xehp_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
>>  {
>>	return reg_in_range_table(addr, xehp_oa_b_counters) ||
>> -		reg_in_range_table(addr, gen12_oa_b_counters);
>> +		reg_in_range_table(addr, gen12_oa_b_counters) ||
>> +		xehp_is_valid_oam_b_counter_addr(perf, addr);
>>  }
>>
>>  static bool gen12_is_valid_mux_addr(struct i915_perf *perf, u32 addr)
>> @@ -4846,11 +4940,39 @@ static u32 __num_perf_groups_per_gt(struct intel_gt *gt)
>>	enum intel_platform platform = INTEL_INFO(gt->i915)->platform;
>>
>>	switch (platform) {
>> +	case INTEL_METEORLAKE:
>> +		return 1;
>
>I don't think we need this, as proposed previously maybe the function
>should just unconditionally return 1.
>
>>	default:
>>		return 1;
>>	}
>>  }
>>
>> +static u32 __oam_engine_group(struct intel_engine_cs *engine)
>> +{
>> +	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
>> +	struct intel_gt *gt = engine->gt;
>> +	u32 group = PERF_GROUP_INVALID;
>> +
>> +	switch (platform) {
>> +	case INTEL_METEORLAKE:
>
>Replace here with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))'.
>
>> +		/*
>> +		 * There's 1 SAMEDIA gt and 1 OAM per SAMEDIA gt. All media slices
>> +		 * within the gt use the same OAM. All MTL SKUs list 1 SA MEDIA.
>> +		 */
>> +		drm_WARN_ON(&engine->i915->drm,
>> +			    engine->gt->type != GT_MEDIA);
>> +
>> +		group = PERF_GROUP_OAM_SAMEDIA_0;
>> +		break;
>> +	default:
>> +		break;
>> +	}
>> +
>> +	drm_WARN_ON(&gt->i915->drm, group >= __num_perf_groups_per_gt(gt));
>> +
>> +	return group;
>> +}
>> +
>>  static u32 __oa_engine_group(struct intel_engine_cs *engine)
>>  {
>>	if (!engine_supports_oa(engine))
>
>As mentioned above for engine_supports_oa, this looks like a duplication of
>the checks below and should probably be removed.

I recall adding this because on older platforms we were seeing a
drm_WARN_ON splat from __oam_engine_group, since those platforms also have
a media engine. I would prefer to implement all your suggestions in this
patch and drop the drm_WARN_ON from __oam_engine_group. Is that okay?
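
Roughly what I have in mind, as a standalone model rather than the actual
driver code: a media engine on a platform without OAM quietly maps to an
invalid group instead of warning. struct engine and its fields are
hypothetical stand-ins, and the group ids are assumed to both be 0 since
there is one group per GT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum engine_class {
	RENDER_CLASS,
	VIDEO_DECODE_CLASS,
	COPY_ENGINE_CLASS,
	VIDEO_ENHANCEMENT_CLASS,
};

/* Assumed ids: one group per GT, so both groups use index 0. */
#define PERF_GROUP_OAG			0u
#define PERF_GROUP_OAM_SAMEDIA_0	0u
#define PERF_GROUP_INVALID		UINT32_MAX

/* Hypothetical, cut-down stand-in for struct intel_engine_cs. */
struct engine {
	enum engine_class engine_class;
	bool platform_has_oam;
	bool on_media_gt;
};

static uint32_t oa_engine_group(const struct engine *e)
{
	assert(e != NULL);

	switch (e->engine_class) {
	case RENDER_CLASS:
		return PERF_GROUP_OAG;
	case VIDEO_DECODE_CLASS:
	case VIDEO_ENHANCEMENT_CLASS:
		/* Older platforms have media engines but no OAM: no warning. */
		if (!e->platform_has_oam || !e->on_media_gt)
			return PERF_GROUP_INVALID;
		return PERF_GROUP_OAM_SAMEDIA_0;
	default:
		return PERF_GROUP_INVALID;
	}
}
```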

>
>> @@ -4860,11 +4982,58 @@ static u32 __oa_engine_group(struct intel_engine_cs *engine)
>>	case RENDER_CLASS:
>>		return PERF_GROUP_OAG;
>>
>> +	case VIDEO_DECODE_CLASS:
>> +	case VIDEO_ENHANCEMENT_CLASS:
>> +		return __oam_engine_group(engine);
>> +
>>	default:
>>		return PERF_GROUP_INVALID;
>>	}
>>  }
>>
>> +static struct i915_perf_regs __oam_regs(u32 base)
>> +{
>> +	return (struct i915_perf_regs) {
>> +		base,
>> +		GEN12_OAM_HEAD_POINTER(base),
>> +		GEN12_OAM_TAIL_POINTER(base),
>> +		GEN12_OAM_BUFFER(base),
>> +		GEN12_OAM_CONTEXT_CONTROL(base),
>> +		GEN12_OAM_CONTROL(base),
>> +		GEN12_OAM_DEBUG(base),
>> +		GEN12_OAM_STATUS(base),
>> +		GEN12_OAM_CONTROL_COUNTER_FORMAT_SHIFT,
>> +	};
>> +}
>> +
>> +static struct i915_perf_regs __oag_regs(void)
>> +{
>> +	return (struct i915_perf_regs) {
>> +		0,
>> +		GEN12_OAG_OAHEADPTR,
>> +		GEN12_OAG_OATAILPTR,
>> +		GEN12_OAG_OABUFFER,
>> +		GEN12_OAG_OAGLBCTXCTRL,
>> +		GEN12_OAG_OACONTROL,
>> +		GEN12_OAG_OA_DEBUG,
>> +		GEN12_OAG_OASTATUS,
>> +		GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT,
>> +	};
>> +}
>> +
>> +static void oa_init_regs(struct intel_gt *gt, u32 id)
>> +{
>> +	struct i915_perf_group *group = &gt->perf.group[id];
>> +	struct i915_perf_regs *regs = &group->regs;
>> +
>> +	if (id == PERF_GROUP_OAG && gt->type != GT_MEDIA)
>> +		*regs = __oag_regs();
>> +	else if (IS_METEORLAKE(gt->i915))
>
>Replace with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))', OAM
>registers are identical in later platforms too. Maybe get rid of drm_WARN
>below?
>
>> +		*regs = __oam_regs(mtl_oa_base[id]);
>> +	else
>> +		drm_WARN(&gt->i915->drm, 1, "Unsupported platform for OA\n");
>> +}
>> +
>>  static void oa_init_groups(struct intel_gt *gt)
>>  {
>>	int i, num_groups = gt->perf.num_perf_groups;
>> @@ -4881,6 +5050,24 @@ static void oa_init_groups(struct intel_gt *gt)
>>		g->oa_unit_id = perf->oa_unit_ids++;
>>
>>		g->gt = gt;
>> +		oa_init_regs(gt, i);
>> +		g->fw_domains = FORCEWAKE_ALL;
>> +		if (i == PERF_GROUP_OAG) {
>> +			g->type = TYPE_OAG;
>> +
>> +			/*
>> +			 * Enabling all fw domains for OAG caps the max GT
>> +			 * frequency to media FF max. This could be less than
>> +			 * what the user sets through the sysfs and perf
>> +			 * measurements could be skewed. Since some platforms
>> +			 * have separate OAM units to measure media perf, do not
>> +			 * enable media fw domains for OAG.
>> +			 */
>> +			if (HAS_OAM(gt->i915))
>> +				g->fw_domains = FORCEWAKE_GT | FORCEWAKE_RENDER;
>
>Is this needed even when media and render are separate tiles, which is the
>only case we have in this code right now? For separate tiles setting
>FORCEWAKE_ALL should not cap the freq, correct?
>
>If not needed we can get rid of g->fw_domains.
>
>> +		} else {
>> +			g->type = TYPE_OAM;
>
>This is wrong, because num_perf_groups is 1. So the type should be assigned
>not based on i (which is always 0) but maybe similar to what is done in
>oa_init_regs above.

That's a bug. Will fix. It was working as expected for some other 
platforms that had OAM, but with PERF_GROUP_OAM_SAMEDIA_0, it broke.
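
A minimal user-space sketch of why the index-based assignment breaks and
one possible shape of the fix, mirroring the condition already used in
oa_init_regs. The enum values below are stand-ins chosen for
illustration (the aliasing of both group ids to 0 is an assumption taken
from the review comments), and example_group_type() is a hypothetical
helper, not code from the patch:

```c
/* Stand-in definitions; the real ones live in the i915 headers. */
enum example_gt_type { EX_GT_PRIMARY, EX_GT_TILE, EX_GT_MEDIA };
enum { TYPE_OAG, TYPE_OAM };

/* Assumption: both ids alias index 0 when there is one group per GT. */
enum { PERF_GROUP_OAG = 0, PERF_GROUP_OAM_SAMEDIA_0 = 0 };

/* Current logic: keyed on the loop index only, so index 0 is always OAG. */
static int example_buggy_group_type(unsigned int i)
{
	return i == PERF_GROUP_OAG ? TYPE_OAG : TYPE_OAM;
}

/* Possible fix: also look at the GT type, as oa_init_regs does. */
static int example_group_type(unsigned int id, enum example_gt_type gt_type)
{
	if (id == PERF_GROUP_OAG && gt_type != EX_GT_MEDIA)
		return TYPE_OAG;

	return TYPE_OAM;
}
```

With these stand-ins, the index-only version reports TYPE_OAG for the
media GT's only group, while the GT-aware version reports TYPE_OAM.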

Thanks,
Umesh

* Re: [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units
  2023-02-25  0:58     ` Umesh Nerlige Ramappa
@ 2023-02-25  3:58       ` Dixit, Ashutosh
  0 siblings, 0 replies; 33+ messages in thread
From: Dixit, Ashutosh @ 2023-02-25  3:58 UTC (permalink / raw)
  To: Umesh Nerlige Ramappa; +Cc: intel-gfx

On Fri, 24 Feb 2023 16:58:39 -0800, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

> On Thu, Feb 23, 2023 at 12:05:02PM -0800, Dixit, Ashutosh wrote:
> > On Thu, 16 Feb 2023 16:58:50 -0800, Umesh Nerlige Ramappa wrote:
> >>
> >
> > Hi Umesh,
> >
> >> MTL introduces additional OA units dedicated to media use cases. Add
> >> support for programming these OA units by passing the media engine class
> >> and instance parameters.
> >>
> >> UMD specific changes for GPUvis support:
> >> https://patchwork.freedesktop.org/patch/522827/?series=114023
> >> https://patchwork.freedesktop.org/patch/522822/?series=114023
> >> https://patchwork.freedesktop.org/patch/522826/?series=114023
> >> https://patchwork.freedesktop.org/patch/522828/?series=114023
> >> https://patchwork.freedesktop.org/patch/522816/?series=114023
> >> https://patchwork.freedesktop.org/patch/522825/?series=114023
> >
> > General comment about the patch in case I miss something out, as I've
> > mentioned previously in general let's try to replace INTEL_METEORLAKE and
> > IS_METEORLAKE checks in the patch with:
> >
> >	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> >
> > So that we don't have to enumerate each platform individually later.
>
> Hmm, I recall that you had already commented about this at some point,
> sorry I missed that. Do you suggest I add this change in places outside
> this patch as well?

Not needed unless you want to add another patch. At least in this patch
let's not start doing this in places where it did not exist previously.
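
For reference, a self-contained sketch of how a packed version
comparison like the one suggested above behaves. The IP_VER() macro here
is modeled on the i915 one (major version shifted left by 8, OR'ed with
the release), while struct example_ip_version and both helpers are
hypothetical stand-ins for the device info lookup:

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack major version and release into one comparable value. */
#define IP_VER(ver, rel) (((uint32_t)(ver) << 8) | (uint32_t)(rel))

/* Hypothetical stand-in for the graphics IP version in the device info. */
struct example_ip_version {
	uint8_t ver;
	uint8_t rel;
};

static uint32_t example_graphics_ver_full(const struct example_ip_version *ip)
{
	return IP_VER(ip->ver, ip->rel);
}

/* One range check covers 12.70 and every later IP version. */
static bool example_has_oam_regs(const struct example_ip_version *ip)
{
	return example_graphics_ver_full(ip) >= IP_VER(12, 70);
}
```

The point of the range check is that a later platform is picked up
without adding a new case label for it.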

> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_drv.h          |   2 +
> >>  drivers/gpu/drm/i915/i915_pci.c          |   1 +
> >>  drivers/gpu/drm/i915/i915_perf.c         | 247 ++++++++++++++++++++---
> >>  drivers/gpu/drm/i915/i915_perf_oa_regs.h |  78 +++++++
> >>  drivers/gpu/drm/i915/i915_perf_types.h   |  40 ++++
> >>  drivers/gpu/drm/i915/intel_device_info.h |   1 +
> >>  include/uapi/drm/i915_drm.h              |   4 +
> >>  7 files changed, 347 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >> index 0393273faa09..f3cacbf41c86 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -856,6 +856,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
> >>	(INTEL_INFO(dev_priv)->has_oa_bpc_reporting)
> >>  #define HAS_OA_SLICE_CONTRIB_LIMITS(dev_priv) \
> >>	(INTEL_INFO(dev_priv)->has_oa_slice_contrib_limits)
> >> +#define HAS_OAM(dev_priv) \
> >> +	(INTEL_INFO(dev_priv)->has_oam)
> >>
> >>  /*
> >>   * Set this flag, when platform requires 64K GTT page sizes or larger for
> >> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> >> index a8d942b16223..621730b6551c 100644
> >> --- a/drivers/gpu/drm/i915/i915_pci.c
> >> +++ b/drivers/gpu/drm/i915/i915_pci.c
> >> @@ -1028,6 +1028,7 @@ static const struct intel_device_info adl_p_info = {
> >>	.has_mslice_steering = 1, \
> >>	.has_oa_bpc_reporting = 1, \
> >>	.has_oa_slice_contrib_limits = 1, \
> >> +	.has_oam = 1, \
> >>	.has_rc6 = 1, \
> >>	.has_reset_engine = 1, \
> >>	.has_rps = 1, \
> >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> >> index f028df812067..a57690f4c531 100644
> >> --- a/drivers/gpu/drm/i915/i915_perf.c
> >> +++ b/drivers/gpu/drm/i915/i915_perf.c
> >> @@ -192,6 +192,7 @@
> >>   */
> >>
> >>  #include <linux/anon_inodes.h>
> >> +#include <linux/nospec.h>
> >>  #include <linux/sizes.h>
> >>  #include <linux/uuid.h>
> >>
> >> @@ -326,6 +327,13 @@ static const struct i915_oa_format oa_formats[I915_OA_FORMAT_MAX] = {
> >>	[I915_OA_FORMAT_A32u40_A4u32_B8_C8] = { 5, 256 },
> >>	[I915_OAR_FORMAT_A32u40_A4u32_B8_C8]    = { 5, 256 },
> >>	[I915_OA_FORMAT_A24u40_A14u32_B8_C8]    = { 5, 256 },
> >> +	[I915_OAM_FORMAT_MPEC8u64_B8_C8]	= { 1, 192, TYPE_OAM, HDR_64_BIT },
> >> +	[I915_OAM_FORMAT_MPEC8u32_B8_C8]	= { 2, 128, TYPE_OAM, HDR_64_BIT },
> >> +};
> >> +
> >> +/* PERF_GROUP_OAG is unused for oa_base, drop it for mtl */
> >
> > What does this comment mean?
>
> There are multiple OAM units and the base for each is used to calculate the
> OA regs mmio address. OAG is just one unit with the same addresses for the
> regs, so we don't use this array that initializes the bases for OA
> units. Maybe I will add this in the comment here.

You mean base is 0 in __oag_regs(), correct? And index 0 corresponds to
PERF_GROUP_OAM_SAMEDIA_0, not to PERF_GROUP_OAG? Either drop the comment
(the code is clear enough, I think) or make it clearer.
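
To illustrate the indexing being discussed, here is a small sketch of a
base-relative register table. The base value 0x393000 comes from the
patch; the aliasing of both group ids to index 0 is an assumption drawn
from this discussion, and the two offsets are placeholders picked for
illustration, not the real OAM register layout:

```c
#include <stdint.h>

/* Assumption: both group ids alias index 0. */
enum {
	PERF_GROUP_OAG = 0,
	PERF_GROUP_OAM_SAMEDIA_0 = 0,
};

/* Base taken from the patch's mtl_oa_base[]. */
static const uint32_t mtl_oa_base[] = {
	[PERF_GROUP_OAM_SAMEDIA_0] = 0x393000,
};

/* Placeholder offsets for illustration only. */
#define EXAMPLE_HEAD_PTR_OFFSET 0x10
#define EXAMPLE_TAIL_PTR_OFFSET 0x14

/* Reduced, hypothetical version of the per-group register set. */
struct example_perf_regs {
	uint32_t base;
	uint32_t head_ptr;
	uint32_t tail_ptr;
};

/* OAM registers are derived from the unit's base address. */
static struct example_perf_regs example_oam_regs(uint32_t base)
{
	return (struct example_perf_regs) {
		.base = base,
		.head_ptr = base + EXAMPLE_HEAD_PTR_OFFSET,
		.tail_ptr = base + EXAMPLE_TAIL_PTR_OFFSET,
	};
}
```

Since OAG uses fixed addresses and never looks up this array, slot 0
is free to hold the first OAM unit's base.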

>
> >
> >> +static const u32 mtl_oa_base[] = {
> >> +	[PERF_GROUP_OAM_SAMEDIA_0] = 0x393000,
> >>  };
> >>
> >>  #define SAMPLE_OA_REPORT      (1<<0)
> >> @@ -418,11 +426,17 @@ static void free_oa_config_bo(struct i915_oa_config_bo *oa_bo)
> >>	kfree(oa_bo);
> >>  }
> >>
> >> +static inline const
> >> +struct i915_perf_regs *__oa_regs(struct i915_perf_stream *stream)
> >> +{
> >> +	return &stream->oa_buffer.group->regs;
> >
> > Should just use stream->engine->oa_group->regs, see near the bottom.
> >
> >> +}
> >> +
> >>  static u32 gen12_oa_hw_tail_read(struct i915_perf_stream *stream)
> >>  {
> >>	struct intel_uncore *uncore = stream->uncore;
> >>
> >> -	return intel_uncore_read(uncore, GEN12_OAG_OATAILPTR) &
> >> +	return intel_uncore_read(uncore, __oa_regs(stream)->oa_tail_ptr) &
> >>	       GEN12_OAG_OATAILPTR_MASK;
> >>  }
> >>
> >> @@ -886,7 +900,8 @@ static int gen8_append_oa_reports(struct i915_perf_stream *stream,
> >>		i915_reg_t oaheadptr;
> >>
> >>		oaheadptr = GRAPHICS_VER(stream->perf->i915) == 12 ?
> >
> > >= 12 ?
> >
> >> -			    GEN12_OAG_OAHEADPTR : GEN8_OAHEADPTR;
> >> +			    __oa_regs(stream)->oa_head_ptr :
> >> +			    GEN8_OAHEADPTR;
> >>
> >>		spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
> >>
> >> @@ -939,7 +954,8 @@ static int gen8_oa_read(struct i915_perf_stream *stream,
> >>		return -EIO;
> >>
> >>	oastatus_reg = GRAPHICS_VER(stream->perf->i915) == 12 ?
> >
> > >= 12 ?
> >
> >> -		       GEN12_OAG_OASTATUS : GEN8_OASTATUS;
> >> +		       __oa_regs(stream)->oa_status :
> >> +		       GEN8_OASTATUS;
> >>
> >>	oastatus = intel_uncore_read(uncore, oastatus_reg);
> >>
> >> @@ -1643,16 +1659,46 @@ free_noa_wait(struct i915_perf_stream *stream)
> >>	i915_vma_unpin_and_release(&stream->noa_wait, 0);
> >>  }
> >>
> >> +/*
> >> + * intel_engine_lookup_user ensures that most of engine specific checks are
> >> + * taken care of, however, we can run into a case where the OA unit catering to
> >> + * the engine passed by the user is disabled for some reason. In such cases,
> >> + * ensure oa unit corresponding to an engine is functional. If there are no
> >> + * engines in the group, the unit is disabled.
> >> + */
> >> +static bool oa_unit_functional(const struct intel_engine_cs *engine)
> >> +{
> >> +	return engine->oa_group && engine->oa_group->num_engines;
> >> +}
> >> +
> >>  static bool engine_supports_oa(const struct intel_engine_cs *engine)
> >>  {
> >>	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
> >>
> >>	switch (platform) {
> >> +	case INTEL_METEORLAKE:
> >> +		return engine->class == RENDER_CLASS ||
> >> +		       ((engine->class == VIDEO_DECODE_CLASS ||
> >> +			 engine->class == VIDEO_ENHANCEMENT_CLASS) &&
> >> +			engine->gt->type == GT_MEDIA);
> >>	default:
> >>		return engine->class == RENDER_CLASS;
> >>	}
> >
> > As mentioned in a previous patch, this could just be:
> >
> >	return engine->oa_group;
> >
> > Because all these checks have already been done when the perf groups were
> > initialized so let's use that, as is done for oa_unit_functional.
> >
> > Though, caution, to return engine->oa_group we'd have to remove
> > engine_supports_oa from __oa_engine_group, since engine->oa_group is not
> > yet assigned there. But I think the engine_supports_oa check in
> > __oa_engine_group is a duplication and should be removed.
> >
> >>  }
> >>
> >> +static bool engine_class_supports_oa_format(struct intel_engine_cs *engine, int type)
> >> +{
> >> +	switch (engine->class) {
> >> +	case RENDER_CLASS:
> >> +		return type == TYPE_OAG;
> >> +	case VIDEO_DECODE_CLASS:
> >> +	case VIDEO_ENHANCEMENT_CLASS:
> >> +		return type == TYPE_OAM;
> >> +	default:
> >> +		return false;
> >> +	}
> >> +}
> >> +
> >
> > Again, how about:
> >
> >	return engine->oa_group && engine->oa_group->type == type;
> >
> > Otherwise as mentioned below oa_group->type is unused and also incorrectly
> > assigned at present. The format type and group types are the same
> > (TYPE_OAG/TYPE_OAM). Can name the function engine_supports_oa_format.
> >
> >>  static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
> >>  {
> >>	struct i915_perf *perf = stream->perf;
> >> @@ -1680,7 +1726,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
> >>		drm_WARN_ON(&gt->i915->drm,
> >>			    intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc));
> >>
> >> -	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
> >> +	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
> >>	intel_engine_pm_put(stream->engine);
> >>
> >>	if (stream->ctx)
> >> @@ -1804,8 +1850,8 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
> >>
> >>	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
> >>
> >> -	intel_uncore_write(uncore, GEN12_OAG_OASTATUS, 0);
> >> -	intel_uncore_write(uncore, GEN12_OAG_OAHEADPTR,
> >> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_status, 0);
> >> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_head_ptr,
> >>			   gtt_offset & GEN12_OAG_OAHEADPTR_MASK);
> >>	stream->oa_buffer.head = gtt_offset;
> >>
> >> @@ -1817,9 +1863,9 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
> >>	 *  to enable proper functionality of the overflow
> >>	 *  bit."
> >>	 */
> >> -	intel_uncore_write(uncore, GEN12_OAG_OABUFFER, gtt_offset |
> >> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_buffer, gtt_offset |
> >>			   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT);
> >> -	intel_uncore_write(uncore, GEN12_OAG_OATAILPTR,
> >> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_tail_ptr,
> >>			   gtt_offset & GEN12_OAG_OATAILPTR_MASK);
> >>
> >>	/* Mark that we need updated tail pointers to read from... */
> >> @@ -2579,7 +2625,8 @@ gen8_modify_self(struct intel_context *ce,
> >>	return err;
> >>  }
> >>
> >> -static int gen8_configure_context(struct i915_gem_context *ctx,
> >> +static int gen8_configure_context(struct i915_perf_stream *stream,
> >> +				  struct i915_gem_context *ctx,
> >>				  struct flex *flex, unsigned int count)
> >>  {
> >>	struct i915_gem_engines_iter it;
> >> @@ -2589,7 +2636,8 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
> >>	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
> >>		GEM_BUG_ON(ce == ce->engine->kernel_context);
> >>
> >> -		if (!engine_supports_oa(ce->engine))
> >> +		if (!engine_supports_oa(ce->engine) ||
> >> +		    ce->engine->class != stream->engine->class)
> >>			continue;
> >>
> >>		/* Otherwise OA settings will be set upon first use */
> >> @@ -2720,7 +2768,7 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
> >>
> >>		spin_unlock(&i915->gem.contexts.lock);
> >>
> >> -		err = gen8_configure_context(ctx, regs, num_regs);
> >> +		err = gen8_configure_context(stream, ctx, regs, num_regs);
> >>		if (err) {
> >>			i915_gem_context_put(ctx);
> >>			return err;
> >> @@ -2740,7 +2788,8 @@ oa_configure_all_contexts(struct i915_perf_stream *stream,
> >>	for_each_uabi_engine(engine, i915) {
> >>		struct intel_context *ce = engine->kernel_context;
> >>
> >> -		if (!engine_supports_oa(ce->engine))
> >> +		if (!engine_supports_oa(ce->engine) ||
> >> +		    ce->engine->class != stream->engine->class)
> >>			continue;
> >>
> >>		regs[0].value = intel_sseu_make_rpcs(engine->gt, &ce->sseu);
> >> @@ -2765,6 +2814,9 @@ gen12_configure_all_contexts(struct i915_perf_stream *stream,
> >>		},
> >>	};
> >>
> >> +	if (stream->engine->class != RENDER_CLASS)
> >> +		return 0;
> >
> > OK, this is for render, nothing equivalent needed for media?
>
> Media engines decided not to have anything configured in the CS contexts,
> rather everything is saved/restored in power context transitions, so
> nothing to be done here.

Great!

>
> >
> >> +
> >>	return oa_configure_all_contexts(stream,
> >>					 regs, ARRAY_SIZE(regs),
> >>					 active);
> >> @@ -2894,7 +2946,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
> >>				   _MASKED_BIT_ENABLE(GEN12_DISABLE_DOP_GATING));
> >>	}
> >>
> >> -	intel_uncore_write(uncore, GEN12_OAG_OA_DEBUG,
> >> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_debug,
> >>			   /* Disable clk ratio reports, like previous Gens. */
> >>			   _MASKED_BIT_ENABLE(GEN12_OAG_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS |
> >>					      GEN12_OAG_OA_DEBUG_INCLUDE_CLK_RATIO) |
> >> @@ -2904,7 +2956,7 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
> >>			    */
> >>			   oag_report_ctx_switches(stream));
> >>
> >> -	intel_uncore_write(uncore, GEN12_OAG_OAGLBCTXCTRL, periodic ?
> >> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctx_ctrl, periodic ?
> >>			   (GEN12_OAG_OAGLBCTXCTRL_COUNTER_RESUME |
> >>			    GEN12_OAG_OAGLBCTXCTRL_TIMER_ENABLE |
> >>			    (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT))
> >> @@ -3058,8 +3110,8 @@ static void gen8_oa_enable(struct i915_perf_stream *stream)
> >>
> >>  static void gen12_oa_enable(struct i915_perf_stream *stream)
> >>  {
> >> -	struct intel_uncore *uncore = stream->uncore;
> >> -	u32 report_format = stream->oa_buffer.format->format;
> >> +	const struct i915_perf_regs *regs;
> >> +	u32 val;
> >>
> >>	/*
> >>	 * If we don't want OA reports from the OA buffer, then we don't even
> >> @@ -3070,9 +3122,11 @@ static void gen12_oa_enable(struct i915_perf_stream *stream)
> >>
> >>	gen12_init_oa_buffer(stream);
> >>
> >> -	intel_uncore_write(uncore, GEN12_OAG_OACONTROL,
> >> -			   (report_format << GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT) |
> >> -			   GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE);
> >> +	regs = __oa_regs(stream);
> >> +	val = (stream->oa_buffer.format->format << regs->oa_ctrl_counter_format_shift) |
> >> +	      GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE;
> >> +
> >> +	intel_uncore_write(stream->uncore, regs->oa_ctrl, val);
> >>  }
> >>
> >>  /**
> >> @@ -3124,9 +3178,9 @@ static void gen12_oa_disable(struct i915_perf_stream *stream)
> >>  {
> >>	struct intel_uncore *uncore = stream->uncore;
> >>
> >> -	intel_uncore_write(uncore, GEN12_OAG_OACONTROL, 0);
> >> +	intel_uncore_write(uncore, __oa_regs(stream)->oa_ctrl, 0);
> >>	if (intel_wait_for_register(uncore,
> >> -				    GEN12_OAG_OACONTROL,
> >> +				    __oa_regs(stream)->oa_ctrl,
> >>				    GEN12_OAG_OACONTROL_OA_COUNTER_ENABLE, 0,
> >>				    50))
> >>		drm_err(&stream->perf->i915->drm,
> >> @@ -3329,6 +3383,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
> >>
> >>	stream->sample_size = sizeof(struct drm_i915_perf_record_header);
> >>
> >> +	stream->oa_buffer.group = g;
> >
> > Should just use stream->engine->oa_group, see near the bottom.
> >
> >>	stream->oa_buffer.format = &perf->oa_formats[props->oa_format];
> >>	if (drm_WARN_ON(&i915->drm, stream->oa_buffer.format->size == 0))
> >>		return -EINVAL;
> >> @@ -3379,7 +3434,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
> >>	 *   references will effectively disable RC6.
> >>	 */
> >>	intel_engine_pm_get(stream->engine);
> >> -	intel_uncore_forcewake_get(stream->uncore, FORCEWAKE_ALL);
> >> +	intel_uncore_forcewake_get(stream->uncore, g->fw_domains);
> >>
> >>	/*
> >>	 * Wa_16011777198:dg2: GuC resets render as part of the Wa. This causes
> >> @@ -3440,7 +3495,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
> >>		intel_guc_slpc_unset_gucrc_mode(&gt->uc.guc.slpc);
> >>
> >>  err_gucrc:
> >> -	intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
> >> +	intel_uncore_forcewake_put(stream->uncore, g->fw_domains);
> >>	intel_engine_pm_put(stream->engine);
> >>
> >>	free_oa_configs(stream);
> >> @@ -4033,6 +4088,7 @@ static int read_properties_unlocked(struct i915_perf *perf,
> >>				    struct perf_open_properties *props)
> >>  {
> >>	struct drm_i915_gem_context_param_sseu user_sseu;
> >> +	const struct i915_oa_format *f;
> >>	u64 __user *uprop = uprops;
> >>	bool config_sseu = false;
> >>	u8 class, instance;
> >> @@ -4203,6 +4259,17 @@ static int read_properties_unlocked(struct i915_perf *perf,
> >>	if (!engine_supports_oa(props->engine))
> >>		return -EINVAL;
> >>
> >> +	if (!oa_unit_functional(props->engine))
> >> +		return -ENODEV;
> >> +
> >> +	i = array_index_nospec(props->oa_format, I915_OA_FORMAT_MAX);
> >
> > Why do we need this (something to do with speculation)? Can just do
> > '&perf->oa_formats[props->oa_format]' below? The format passed in has
> > already been checked in the switch statement above.
>
> Traced it back to "smatch cleanups" commit in rebase history. Something to
> do with static analysis. If not a major concern, I would leave it as is.

OK, though that internal commit doesn't show the warning for i915_perf.c :/

Though just found this online: https://lwn.net/Articles/752408/
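
For anyone following along, a user-space sketch of the branchless clamp
that array_index_nospec is built around, modeled on the kernel's generic
mask computation. The example_ names are hypothetical. This version
omits the optimizer barrier and the arch-specific variants the kernel
uses, so it only demonstrates the arithmetic and offers no actual
speculation guarantee. It also assumes the compiler implements right
shift of a negative signed value as an arithmetic shift, which GCC and
Clang do:

```c
#include <limits.h>

/*
 * All ones when index < size, zero otherwise, computed without a branch.
 * If index >= size, (size - 1 - index) wraps and sets the top bit.
 */
static unsigned long example_index_mask(unsigned long index,
					unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >>
	       (sizeof(long) * CHAR_BIT - 1);
}

/* In-range indices pass through; out-of-range indices collapse to 0. */
static unsigned long example_index_nospec(unsigned long index,
					  unsigned long size)
{
	return index & example_index_mask(index, size);
}
```

So even though the format has already been validated in the switch, the
clamp keeps a mispredicted path from indexing past the end of the array.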

>
> >
> >> +	f = &perf->oa_formats[i];
> >> +	if (!engine_class_supports_oa_format(props->engine, f->type)) {
> >> +		DRM_DEBUG("Invalid OA format %d for class %d\n",
> >> +			  f->type, props->engine->class);
> >> +		return -EINVAL;
> >> +	}
> >> +
> >>	if (config_sseu) {
> >>		ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
> >>		if (ret) {
> >> @@ -4383,6 +4450,14 @@ static const struct i915_range gen12_oa_b_counters[] = {
> >>	{}
> >>  };
> >>
> >> +static const struct i915_range mtl_oam_b_counters[] = {
> >> +	{ .start = 0x393000, .end = 0x39301c },	/* GEN12_OAM_STARTTRIG1[1-8] */
> >> +	{ .start = 0x393020, .end = 0x39303c },	/* GEN12_OAM_REPORTTRIG1[1-8] */
> >> +	{ .start = 0x393040, .end = 0x39307c },	/* GEN12_OAM_CEC[0-7][0-1] */
> >> +	{ .start = 0x393200, .end = 0x39323C },	/* MPES[0-7] */
> >> +	{}
> >> +};
> >> +
> >>  static const struct i915_range xehp_oa_b_counters[] = {
> >>	{ .start = 0xdc48, .end = 0xdc48 },	/* OAA_ENABLE_REG */
> >>	{ .start = 0xdd00, .end = 0xdd48 },	/* OAG_LCE0_0 - OAA_LENABLE_REG */
> >> @@ -4429,13 +4504,16 @@ static const struct i915_range gen12_oa_mux_regs[] = {
> >>
> >>  /*
> >>   * Ref: 14010536224:
> >> - * 0x20cc is repurposed on MTL, so use a separate array for MTL.
> >> + * 0x20cc is repurposed on MTL, so use a separate array for MTL. Also add the
> >> + * MPES/MPEC registers.
> >
> > MPES/MPEC registers are added above now, not here so maybe get rid of the
> > comment change above?
> >
> >>   */
> >>  static const struct i915_range mtl_oa_mux_regs[] = {
> >>	{ .start = 0x0d00, .end = 0x0d04 },	/* RPM_CONFIG[0-1] */
> >>	{ .start = 0x0d0c, .end = 0x0d2c },	/* NOA_CONFIG[0-8] */
> >>	{ .start = 0x9840, .end = 0x9840 },	/* GDT_CHICKEN_BITS */
> >>	{ .start = 0x9884, .end = 0x9888 },	/* NOA_WRITE */
> >> +	{ .start = 0x38d100, .end = 0x38d114},	/* VISACTL */
> >> +	{}
> >>  };
> >>
> >>  static bool gen7_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
> >> @@ -4473,10 +4551,26 @@ static bool gen12_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
> >>	return reg_in_range_table(addr, gen12_oa_b_counters);
> >>  }
> >>
> >> +static bool xehp_is_valid_oam_b_counter_addr(struct i915_perf *perf, u32 addr)
> >> +{
> >> +	enum intel_platform platform = INTEL_INFO(perf->i915)->platform;
> >> +
> >> +	if (!HAS_OAM(perf->i915))
> >> +		return false;
> >> +
> >> +	switch (platform) {
> >> +	case INTEL_METEORLAKE:
> >> +		return reg_in_range_table(addr, mtl_oam_b_counters);
> >> +	default:
> >> +		return false;
> >> +	}
> >
> > Replace with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))', registers
> > are identical in later platforms too.
> >
> > Should the function prefix be xehp or mtl? Don't see xehp in bspec,
> > probably xehp is discontinued.
> >
> >> +}
> >> +
> >>  static bool xehp_is_valid_b_counter_addr(struct i915_perf *perf, u32 addr)
> >>  {
> >>	return reg_in_range_table(addr, xehp_oa_b_counters) ||
> >> -		reg_in_range_table(addr, gen12_oa_b_counters);
> >> +		reg_in_range_table(addr, gen12_oa_b_counters) ||
> >> +		xehp_is_valid_oam_b_counter_addr(perf, addr);
> >>  }
> >>
> >>  static bool gen12_is_valid_mux_addr(struct i915_perf *perf, u32 addr)
> >> @@ -4846,11 +4940,39 @@ static u32 __num_perf_groups_per_gt(struct intel_gt *gt)
> >>	enum intel_platform platform = INTEL_INFO(gt->i915)->platform;
> >>
> >>	switch (platform) {
> >> +	case INTEL_METEORLAKE:
> >> +		return 1;
> >
> > I don't think we need this, as proposed previously maybe the function
> > should just unconditionally return 1.
> >
> >>	default:
> >>		return 1;
> >>	}
> >>  }
> >>
> >> +static u32 __oam_engine_group(struct intel_engine_cs *engine)
> >> +{
> >> +	enum intel_platform platform = INTEL_INFO(engine->i915)->platform;
> >> +	struct intel_gt *gt = engine->gt;
> >> +	u32 group = PERF_GROUP_INVALID;
> >> +
> >> +	switch (platform) {
> >> +	case INTEL_METEORLAKE:
> >
> > Replace here with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))'.
> >
> >> +		/*
> >> +		 * There's 1 SAMEDIA gt and 1 OAM per SAMEDIA gt. All media slices
> >> +		 * within the gt use the same OAM. All MTL SKUs list 1 SA MEDIA.
> >> +		 */
> >> +		drm_WARN_ON(&engine->i915->drm,
> >> +			    engine->gt->type != GT_MEDIA);
> >> +
> >> +		group = PERF_GROUP_OAM_SAMEDIA_0;
> >> +		break;
> >> +	default:
> >> +		break;
> >> +	}
> >> +
> >> +	drm_WARN_ON(&gt->i915->drm, group >= __num_perf_groups_per_gt(gt));
> >> +
> >> +	return group;
> >> +}
> >> +
> >>  static u32 __oa_engine_group(struct intel_engine_cs *engine)
> >>  {
> >>	if (!engine_supports_oa(engine))
> >
> > As mentioned above for engine_supports_oa, this looks like a duplication of
> > the checks below and should probably be removed.
>
> I recall adding this because on older platforms we were seeing a WARN_ON
> splat from __oam_engine_group, since those platforms also have media
> engines. Ideally, I would address all of your comments in this patch and
> drop the WARN_ON from __oam_engine_group. Is that okay?

OK you are referring to this warn_on:

	drm_WARN_ON(&gt->i915->drm, group >= __num_perf_groups_per_gt(gt));

Since we are supporting OAM for >= 12.70 we can put the warn_on under the
>= 12.70 check, but if it looks weird because we have just set the group to
PERF_GROUP_OAM_SAMEDIA_0, we could just get rid of the warn_on.
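
A rough sketch of the second option (no warn_on at all), written as
plain user-space C. The enum, the example_ helpers, and the two boolean
parameters are hypothetical stand-ins: oam_capable represents the result
of the >= 12.70 check and on_media_gt represents
engine->gt->type == GT_MEDIA. The group ids aliasing to 0 is an
assumption based on the discussion in this thread:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the engine classes used in the patch. */
enum example_class { EX_RENDER, EX_VIDEO_DECODE, EX_VIDEO_ENHANCE, EX_COPY };

#define PERF_GROUP_OAG			0u
#define PERF_GROUP_OAM_SAMEDIA_0	0u
#define PERF_GROUP_INVALID		UINT32_MAX

/* Media engines on platforms without OAM quietly get no group. */
static uint32_t example_oam_engine_group(bool oam_capable, bool on_media_gt)
{
	if (oam_capable && on_media_gt)
		return PERF_GROUP_OAM_SAMEDIA_0;

	return PERF_GROUP_INVALID;
}

static uint32_t example_oa_engine_group(enum example_class cls,
					bool oam_capable, bool on_media_gt)
{
	switch (cls) {
	case EX_RENDER:
		return PERF_GROUP_OAG;
	case EX_VIDEO_DECODE:
	case EX_VIDEO_ENHANCE:
		return example_oam_engine_group(oam_capable, on_media_gt);
	default:
		return PERF_GROUP_INVALID;
	}
}
```

With this shape an older platform that has media engines just returns
PERF_GROUP_INVALID, so the engine_supports_oa check at the top of
__oa_engine_group would no longer be needed to avoid the splat.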

>
> >
> >> @@ -4860,11 +4982,58 @@ static u32 __oa_engine_group(struct intel_engine_cs *engine)
> >>	case RENDER_CLASS:
> >>		return PERF_GROUP_OAG;
> >>
> >> +	case VIDEO_DECODE_CLASS:
> >> +	case VIDEO_ENHANCEMENT_CLASS:
> >> +		return __oam_engine_group(engine);
> >> +
> >>	default:
> >>		return PERF_GROUP_INVALID;
> >>	}
> >>  }
> >>
> >> +static struct i915_perf_regs __oam_regs(u32 base)
> >> +{
> >> +	return (struct i915_perf_regs) {
> >> +		base,
> >> +		GEN12_OAM_HEAD_POINTER(base),
> >> +		GEN12_OAM_TAIL_POINTER(base),
> >> +		GEN12_OAM_BUFFER(base),
> >> +		GEN12_OAM_CONTEXT_CONTROL(base),
> >> +		GEN12_OAM_CONTROL(base),
> >> +		GEN12_OAM_DEBUG(base),
> >> +		GEN12_OAM_STATUS(base),
> >> +		GEN12_OAM_CONTROL_COUNTER_FORMAT_SHIFT,
> >> +	};
> >> +}
> >> +
> >> +static struct i915_perf_regs __oag_regs(void)
> >> +{
> >> +	return (struct i915_perf_regs) {
> >> +		0,
> >> +		GEN12_OAG_OAHEADPTR,
> >> +		GEN12_OAG_OATAILPTR,
> >> +		GEN12_OAG_OABUFFER,
> >> +		GEN12_OAG_OAGLBCTXCTRL,
> >> +		GEN12_OAG_OACONTROL,
> >> +		GEN12_OAG_OA_DEBUG,
> >> +		GEN12_OAG_OASTATUS,
> >> +		GEN12_OAG_OACONTROL_OA_COUNTER_FORMAT_SHIFT,
> >> +	};
> >> +}
> >> +
> >> +static void oa_init_regs(struct intel_gt *gt, u32 id)
> >> +{
> >> +	struct i915_perf_group *group = &gt->perf.group[id];
> >> +	struct i915_perf_regs *regs = &group->regs;
> >> +
> >> +	if (id == PERF_GROUP_OAG && gt->type != GT_MEDIA)
> >> +		*regs = __oag_regs();
> >> +	else if (IS_METEORLAKE(gt->i915))
> >
> > Replace with 'if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))', OAM
> > registers are identical in later platforms too. Maybe get rid of drm_WARN
> > below?
> >
> >> +		*regs = __oam_regs(mtl_oa_base[id]);
> >> +	else
> >> +		drm_WARN(&gt->i915->drm, 1, "Unsupported platform for OA\n");
> >> +}
> >> +
> >>  static void oa_init_groups(struct intel_gt *gt)
> >>  {
> >>	int i, num_groups = gt->perf.num_perf_groups;
> >> @@ -4881,6 +5050,24 @@ static void oa_init_groups(struct intel_gt *gt)
> >>		g->oa_unit_id = perf->oa_unit_ids++;
> >>
> >>		g->gt = gt;
> >> +		oa_init_regs(gt, i);
> >> +		g->fw_domains = FORCEWAKE_ALL;
> >> +		if (i == PERF_GROUP_OAG) {
> >> +			g->type = TYPE_OAG;
> >> +
> >> +			/*
> >> +			 * Enabling all fw domains for OAG caps the max GT
> >> +			 * frequency to media FF max. This could be less than
> >> +			 * what the user sets through the sysfs and perf
> >> +			 * measurements could be skewed. Since some platforms
> >> +			 * have separate OAM units to measure media perf, do not
> >> +			 * enable media fw domains for OAG.
> >> +			 */
> >> +			if (HAS_OAM(gt->i915))
> >> +				g->fw_domains = FORCEWAKE_GT | FORCEWAKE_RENDER;
> >
> > Is this needed even when media and render are separate tiles, which is the
> > only case we have in this code right now? For separate tiles setting
> > FORCEWAKE_ALL should not cap the freq, correct?
> >
> > If not needed we can get rid of g->fw_domains.
> >
> >> +		} else {
> >> +			g->type = TYPE_OAM;
> >
> > This is wrong, because num_perf_groups is 1. So the type should be assigned
> > not based on i (which is always 0) but maybe similar to what is done in
> > oa_init_regs above.
>
> That's a bug. Will fix. It was working as expected for some other platforms
> that had OAM, but with PERF_GROUP_OAM_SAMEDIA_0, it broke.

Thanks.
--
Ashutosh

Thread overview: 33+ messages
2023-02-17  0:58 [Intel-gfx] [PATCH v2 0/9] Add OAM support for MTL Umesh Nerlige Ramappa
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 1/9] drm/i915/perf: Drop wakeref on GuC RC error Umesh Nerlige Ramappa
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 2/9] drm/i915/perf: Add helper to check supported OA engines Umesh Nerlige Ramappa
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 3/9] drm/i915/perf: Validate OA sseu config outside switch Umesh Nerlige Ramappa
2023-02-17  1:10   ` Dixit, Ashutosh
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 4/9] drm/i915/perf: Group engines into respective OA groups Umesh Nerlige Ramappa
2023-02-22 21:52   ` Dixit, Ashutosh
2023-02-24 17:30     ` Umesh Nerlige Ramappa
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 5/9] drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM Umesh Nerlige Ramappa
2023-02-17  2:04   ` Dixit, Ashutosh
2023-02-17  9:55     ` Jani Nikula
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 6/9] drm/i915/perf: Parse 64bit report header formats correctly Umesh Nerlige Ramappa
2023-02-21 22:14   ` Dixit, Ashutosh
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 7/9] drm/i915/perf: Handle non-power-of-2 reports Umesh Nerlige Ramappa
2023-02-17 20:58   ` Dixit, Ashutosh
2023-02-18  0:05     ` Umesh Nerlige Ramappa
2023-02-18  1:57       ` Dixit, Ashutosh
2023-02-21 18:51         ` Dixit, Ashutosh
2023-02-24 19:12           ` Umesh Nerlige Ramappa
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 8/9] drm/i915/perf: Add engine class instance parameters to perf Umesh Nerlige Ramappa
2023-02-17 23:37   ` Umesh Nerlige Ramappa
2023-02-21 23:53   ` Dixit, Ashutosh
2023-02-22  0:10     ` Dixit, Ashutosh
2023-02-24 19:37     ` Umesh Nerlige Ramappa
2023-02-24 20:48       ` Dixit, Ashutosh
2023-02-17  0:58 ` [Intel-gfx] [PATCH v2 9/9] drm/i915/perf: Add support for OA media units Umesh Nerlige Ramappa
2023-02-17 23:37   ` Umesh Nerlige Ramappa
2023-02-23 20:05   ` Dixit, Ashutosh
2023-02-25  0:58     ` Umesh Nerlige Ramappa
2023-02-25  3:58       ` Dixit, Ashutosh
2023-02-17  1:35 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Add OAM support for MTL (rev2) Patchwork
2023-02-17  1:55 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-02-17 16:09 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
