* [PATCH 1/2] drm/i915/perf: Allow non-privileged access when OA buffer is not sampled
@ 2019-11-11 22:09 ` Umesh Nerlige Ramappa
0 siblings, 0 replies; 12+ messages in thread
From: Umesh Nerlige Ramappa @ 2019-11-11 22:09 UTC (permalink / raw)
To: Lionel G Landwerlin, Chris Wilson, intel-gfx
SAMPLE_OA_REPORT enables sampling of OA reports from the OA buffer.
Because reports in the OA buffer have system-wide visibility, collecting
samples from the OA buffer was a privileged operation on previous
platforms. Prior to TGL, it was also necessary to sample the OA buffer
to normalize reports from MI_REPORT_PERF_COUNT.
TGL has a dedicated OAR unit to sample perf reports for a specific
render context. This removes the need to sample the OA buffer.
- If not sampling the OA buffer, allow non-privileged access. An earlier
patch allows the non-privileged access:
https://patchwork.freedesktop.org/patch/337716/?series=68582&rev=1
- Clear up the path for non-privileged access in this patch
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
drivers/gpu/drm/i915/i915_perf.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 507236bd41ae..b922000e4b9b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -2713,7 +2713,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
return -EINVAL;
}
- if (!(props->sample_flags & SAMPLE_OA_REPORT)) {
+ if (!(props->sample_flags & SAMPLE_OA_REPORT) &&
+ (INTEL_GEN(perf->i915) < 12 || !stream->ctx)) {
DRM_DEBUG("Only OA report sampling supported\n");
return -EINVAL;
}
@@ -2745,7 +2746,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
format_size = perf->oa_formats[props->oa_format].size;
- stream->sample_flags |= SAMPLE_OA_REPORT;
+ stream->sample_flags = props->sample_flags;
stream->sample_size += format_size;
stream->oa_buffer.format_size = format_size;
--
2.20.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 12+ messages in thread
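The first hunk of the patch above changes which streams are rejected when SAMPLE_OA_REPORT is not requested. The following is a minimal standalone sketch of that condition, not driver code: `check_oa_sampling()` and its plain `gen` and `has_ctx` parameters are hypothetical stand-ins for `INTEL_GEN(perf->i915)` and `stream->ctx`, and the flag's bit position is assumed for illustration only.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's SAMPLE_OA_REPORT flag bit. */
#define SAMPLE_OA_REPORT (1u << 0)

/*
 * Mirrors the condition added by the patch: a stream that does not sample
 * the OA buffer is rejected unless the platform is Gen12 or newer AND the
 * stream is bound to a specific context.
 */
static int check_oa_sampling(uint32_t sample_flags, int gen, bool has_ctx)
{
	if (!(sample_flags & SAMPLE_OA_REPORT) && (gen < 12 || !has_ctx))
		return -EINVAL;

	return 0;
}
```

Under these assumptions, only the combination "Gen12 or newer, context supplied" is accepted without SAMPLE_OA_REPORT; streams that do request SAMPLE_OA_REPORT pass this particular check on every generation.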
* [PATCH 2/2] drm/i915/perf: Configure OAR for specific context
@ 2019-11-11 22:09 ` Umesh Nerlige Ramappa
0 siblings, 0 replies; 12+ messages in thread
From: Umesh Nerlige Ramappa @ 2019-11-11 22:09 UTC (permalink / raw)
To: Lionel G Landwerlin, Chris Wilson, intel-gfx
Gen12 supports saving/restoring render counters per context. Apply OAR
configuration only for the context that is passed in to perf.
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
drivers/gpu/drm/i915/i915_perf.c | 203 ++++++++++++++++++-------------
1 file changed, 118 insertions(+), 85 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index b922000e4b9b..63633e73a695 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -2047,6 +2047,18 @@ static u32 oa_config_flex_reg(const struct i915_oa_config *oa_config,
return 0;
}
+
+static void
+gen12_update_reg_state_unlocked(const struct intel_context *ce,
+ const struct i915_perf_stream *stream)
+{
+ u32 *reg_state = ce->lrc_reg_state;
+
+ /* Use a stable power state configuration */
+ reg_state[CTX_R_PWR_CLK_STATE] =
+ intel_sseu_make_rpcs(ce->engine->i915, &ce->sseu);
+}
+
/*
* NB: It must always remain pointer safe to run this even if the OA unit
* has been disabled.
@@ -2073,20 +2085,12 @@ gen8_update_reg_state_unlocked(const struct intel_context *ce,
u32 *reg_state = ce->lrc_reg_state;
int i;
- if (IS_GEN(stream->perf->i915, 12)) {
- u32 format = stream->oa_buffer.format;
+ reg_state[ctx_oactxctrl + 1] =
+ (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
+ (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
+ GEN8_OA_COUNTER_RESUME;
- reg_state[ctx_oactxctrl + 1] =
- (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
- (stream->oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
- } else {
- reg_state[ctx_oactxctrl + 1] =
- (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
- (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
- GEN8_OA_COUNTER_RESUME;
- }
-
- for (i = 0; !!ctx_flexeu0 && i < ARRAY_SIZE(flex_regs); i++)
+ for (i = 0; i < ARRAY_SIZE(flex_regs); i++)
reg_state[ctx_flexeu0 + i * 2 + 1] =
oa_config_flex_reg(stream->oa_config, flex_regs[i]);
@@ -2219,34 +2223,49 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
return err;
}
-static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
+static int gen12_configure_oar_context(struct i915_perf_stream *stream, bool enable)
{
- struct i915_request *rq;
- u32 *cs;
- int err = 0;
-
- rq = i915_request_create(ce);
- if (IS_ERR(rq))
- return PTR_ERR(rq);
+ int err;
+ struct intel_context *ce = stream->pinned_ctx;
+ struct flex regs[] = {
+ {
+ GEN12_OAR_OACONTROL,
+ stream->perf->ctx_oactxctrl_offset + 1,
+ },
+ {
+ RING_CONTEXT_CONTROL(ce->engine->mmio_base),
+ CTX_CONTEXT_CONTROL,
+ },
+ };
+ u32 format = stream->oa_buffer.format;
- cs = intel_ring_begin(rq, 4);
- if (IS_ERR(cs)) {
- err = PTR_ERR(cs);
- goto out;
- }
+ regs[0].value =
+ (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
+ (enable ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
- *cs++ = MI_LOAD_REGISTER_IMM(1);
- *cs++ = i915_mmio_reg_offset(RING_CONTEXT_CONTROL(ce->engine->mmio_base));
- *cs++ = _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
- enable ? GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE : 0);
- *cs++ = MI_NOOP;
+ /* This value is only good for LRI and not for the context image. */
+ regs[1].value = _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
+ enable ?
+ GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE :
+ 0);
- intel_ring_advance(rq, cs);
+ err = intel_context_lock_pinned(ce);
+ if (err)
+ return err;
-out:
- i915_request_add(rq);
+ /* Modify the context image of pinned context.
+ *
+ * We will not modify the CTX CONTEXT CONTROL here as an LRI is
+ * sufficient. OAR_OACONTROL needs to be modified in the context
+ * image as well as with an LRI.
+ */
+ err = gen8_modify_context(ce, regs, ARRAY_SIZE(regs) - 1);
+ intel_context_unlock_pinned(ce);
+ if (err)
+ return err;
- return err;
+ /* Use LRI to modify the MMIOs using pinned context */
+ return gen8_modify_self(ce, regs, ARRAY_SIZE(regs));
}
/*
@@ -2272,53 +2291,16 @@ static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
* per-context OA state.
*
* Note: it's only the RCS/Render context that has any OA state.
+ * Note: the first flex register passed must always be R_PWR_CLK_STATE
*/
-static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
- const struct i915_oa_config *oa_config)
+static int oa_configure_all_contexts(struct i915_perf_stream *stream,
+ struct flex *regs,
+ size_t num_regs)
{
struct drm_i915_private *i915 = stream->perf->i915;
- /* The MMIO offsets for Flex EU registers aren't contiguous */
- const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
-#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
- struct flex regs[] = {
- {
- GEN8_R_PWR_CLK_STATE,
- CTX_R_PWR_CLK_STATE,
- },
- {
- IS_GEN(i915, 12) ?
- GEN12_OAR_OACONTROL : GEN8_OACTXCONTROL,
- stream->perf->ctx_oactxctrl_offset + 1,
- },
- { EU_PERF_CNTL0, ctx_flexeuN(0) },
- { EU_PERF_CNTL1, ctx_flexeuN(1) },
- { EU_PERF_CNTL2, ctx_flexeuN(2) },
- { EU_PERF_CNTL3, ctx_flexeuN(3) },
- { EU_PERF_CNTL4, ctx_flexeuN(4) },
- { EU_PERF_CNTL5, ctx_flexeuN(5) },
- { EU_PERF_CNTL6, ctx_flexeuN(6) },
- };
-#undef ctx_flexeuN
struct intel_engine_cs *engine;
struct i915_gem_context *ctx, *cn;
- size_t array_size = IS_GEN(i915, 12) ? 2 : ARRAY_SIZE(regs);
- int i, err;
-
- if (IS_GEN(i915, 12)) {
- u32 format = stream->oa_buffer.format;
-
- regs[1].value =
- (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
- (oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
- } else {
- regs[1].value =
- (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
- (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
- GEN8_OA_COUNTER_RESUME;
- }
-
- for (i = 2; !!ctx_flexeu0 && i < array_size; i++)
- regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
+ int err;
lockdep_assert_held(&stream->perf->lock);
@@ -2348,7 +2330,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
spin_unlock(&i915->gem.contexts.lock);
- err = gen8_configure_context(ctx, regs, array_size);
+ err = gen8_configure_context(ctx, regs, num_regs);
if (err) {
i915_gem_context_put(ctx);
return err;
@@ -2373,7 +2355,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
regs[0].value = intel_sseu_make_rpcs(i915, &ce->sseu);
- err = gen8_modify_self(ce, regs, array_size);
+ err = gen8_modify_self(ce, regs, num_regs);
if (err)
return err;
}
@@ -2381,6 +2363,56 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
return 0;
}
+static int gen12_configure_all_contexts(struct i915_perf_stream *stream,
+ const struct i915_oa_config *oa_config)
+{
+ struct flex regs[] = {
+ {
+ GEN8_R_PWR_CLK_STATE,
+ CTX_R_PWR_CLK_STATE,
+ },
+ };
+
+ return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
+}
+
+static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
+ const struct i915_oa_config *oa_config)
+{
+ /* The MMIO offsets for Flex EU registers aren't contiguous */
+ const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
+#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
+ struct flex regs[] = {
+ {
+ GEN8_R_PWR_CLK_STATE,
+ CTX_R_PWR_CLK_STATE,
+ },
+ {
+ GEN8_OACTXCONTROL,
+ stream->perf->ctx_oactxctrl_offset + 1,
+ },
+ { EU_PERF_CNTL0, ctx_flexeuN(0) },
+ { EU_PERF_CNTL1, ctx_flexeuN(1) },
+ { EU_PERF_CNTL2, ctx_flexeuN(2) },
+ { EU_PERF_CNTL3, ctx_flexeuN(3) },
+ { EU_PERF_CNTL4, ctx_flexeuN(4) },
+ { EU_PERF_CNTL5, ctx_flexeuN(5) },
+ { EU_PERF_CNTL6, ctx_flexeuN(6) },
+ };
+#undef ctx_flexeuN
+ int i;
+
+ regs[1].value =
+ (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
+ (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
+ GEN8_OA_COUNTER_RESUME;
+
+ for (i = 2; i < ARRAY_SIZE(regs); i++)
+ regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
+
+ return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
+}
+
static int gen8_enable_metric_set(struct i915_perf_stream *stream)
{
struct intel_uncore *uncore = stream->uncore;
@@ -2464,7 +2496,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
* to make sure all slices/subslices are ON before writing to NOA
* registers.
*/
- ret = lrc_configure_all_contexts(stream, oa_config);
+ ret = gen12_configure_all_contexts(stream, oa_config);
if (ret)
return ret;
@@ -2474,8 +2506,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
* requested this.
*/
if (stream->ctx) {
- ret = gen12_emit_oar_config(stream->pinned_ctx,
- oa_config != NULL);
+ ret = gen12_configure_oar_context(stream, oa_config != NULL);
if (ret)
return ret;
}
@@ -2509,11 +2540,11 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream)
struct intel_uncore *uncore = stream->uncore;
/* Reset all contexts' slices/subslices configurations. */
- lrc_configure_all_contexts(stream, NULL);
+ gen12_configure_all_contexts(stream, NULL);
/* disable the context save/restore or OAR counters */
if (stream->ctx)
- gen12_emit_oar_config(stream->pinned_ctx, false);
+ gen12_configure_oar_context(stream, false);
/* Make sure we disable noa to save power. */
intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0);
@@ -2856,7 +2887,9 @@ void i915_oa_init_reg_state(const struct intel_context *ce,
stream = engine->i915->perf.exclusive_stream;
if (stream)
- gen8_update_reg_state_unlocked(ce, stream);
+ IS_GEN(stream->perf->i915, 12) ?
+ gen12_update_reg_state_unlocked(ce, stream) :
+ gen8_update_reg_state_unlocked(ce, stream);
}
/**
--
2.20.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
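The comment in gen12_configure_oar_context() says the `_MASKED_FIELD()` value is only good for an LRI and not for the context image. A rough model of why, assuming the usual masked-register convention in which the upper 16 bits of a write select which of the lower 16 bits are updated: the helper names and the bit position below are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical bit position, used only for this illustration. */
#define EXAMPLE_OAR_CONTEXT_ENABLE (1u << 8)

/* Same shape as a masked-field encoding: write mask in the high half. */
static uint32_t masked_field(uint32_t mask, uint32_t value)
{
	return (mask << 16) | value;
}

/*
 * Model of a masked register write: only bits selected by the mask in the
 * upper half are changed, all other bits keep their previous value.
 */
static uint32_t apply_masked_write(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	return (reg & ~mask & 0xffffu) | (value & mask);
}
```

In this model the encoded word describes an update, not the resulting register contents, so storing it verbatim into a saved register image would not give the same result as applying it through a register write.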
* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/perf: Allow non-privileged access when OA buffer is not sampled
@ 2019-11-11 23:32 ` Patchwork
0 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2019-11-11 23:32 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/perf: Allow non-privileged access when OA buffer is not sampled
URL : https://patchwork.freedesktop.org/series/69321/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
3b64a508eefb drm/i915/perf: Allow non-privileged access when OA buffer is not sampled
c207c66a6167 drm/i915/perf: Configure OAR for specific context
-:281: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "oa_config"
#281: FILE: drivers/gpu/drm/i915/i915_perf.c:2509:
+ ret = gen12_configure_oar_context(stream, oa_config != NULL);
total: 0 errors, 0 warnings, 1 checks, 284 lines checked
^ permalink raw reply [flat|nested] 12+ messages in thread
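The COMPARISON_TO_NULL check above is stylistic. A small sketch of the two spellings, assuming the callee takes a C `bool` as gen12_configure_oar_context() does: the struct and function names here are hypothetical and exist only to show that both forms pass the same value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholder for a configuration object. */
struct oa_config_example {
	int id;
};

/* Stand-in for a callee with a bool "enable" parameter. */
static bool enable_flag(bool enable)
{
	return enable;
}

/* Form used in the patch: explicit comparison against NULL. */
static bool explicit_form(const struct oa_config_example *cfg)
{
	return enable_flag(cfg != NULL);
}

/* Form suggested by checkpatch: the pointer converts to bool implicitly. */
static bool implicit_form(const struct oa_config_example *cfg)
{
	return enable_flag(cfg);
}
```

Both functions return true for a non-NULL pointer and false for NULL, so the choice between them is a matter of coding style rather than behavior.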
* ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/perf: Allow non-privileged access when OA buffer is not sampled
@ 2019-11-11 23:57 ` Patchwork
0 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2019-11-11 23:57 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/perf: Allow non-privileged access when OA buffer is not sampled
URL : https://patchwork.freedesktop.org/series/69321/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_7311 -> Patchwork_15227
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_15227 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_15227, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15227/index.html
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_15227:
### IGT changes ###
#### Possible regressions ####
* igt@gem_render_tiled_blits@basic:
- fi-icl-dsi: [PASS][1] -> [DMESG-WARN][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7311/fi-icl-dsi/igt@gem_render_tiled_blits@basic.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15227/fi-icl-dsi/igt@gem_render_tiled_blits@basic.html
Known issues
------------
Here are the changes found in Patchwork_15227 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@kms_chamelium@hdmi-hpd-fast:
- fi-kbl-7500u: [PASS][3] -> [FAIL][4] ([fdo#111045] / [fdo#111096])
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7311/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15227/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
#### Possible fixes ####
* igt@i915_pm_rpm@module-reload:
- fi-skl-6770hq: [FAIL][5] ([fdo#108511]) -> [PASS][6]
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7311/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15227/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html
* igt@kms_busy@basic-flip-pipe-b:
- fi-skl-6770hq: [DMESG-WARN][7] ([fdo#105541]) -> [PASS][8]
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7311/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15227/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html
#### Warnings ####
* igt@i915_selftest@live_gt_pm:
- fi-icl-guc: [DMESG-FAIL][9] ([fdo#112205]) -> [INCOMPLETE][10] ([fdo#107713])
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7311/fi-icl-guc/igt@i915_selftest@live_gt_pm.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15227/fi-icl-guc/igt@i915_selftest@live_gt_pm.html
[fdo#105541]: https://bugs.freedesktop.org/show_bug.cgi?id=105541
[fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
[fdo#108511]: https://bugs.freedesktop.org/show_bug.cgi?id=108511
[fdo#111045]: https://bugs.freedesktop.org/show_bug.cgi?id=111045
[fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
[fdo#112205]: https://bugs.freedesktop.org/show_bug.cgi?id=112205
Participating hosts (52 -> 43)
------------------------------
Missing (9): fi-ilk-m540 fi-hsw-4200u fi-byt-j1900 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-ivb-3770 fi-byt-clapper fi-bdw-samus
Build changes
-------------
* CI: CI-20190529 -> None
* Linux: CI_DRM_7311 -> Patchwork_15227
CI-20190529: 20190529
CI_DRM_7311: 36d31f70111ea87432ee8a8981943c5b20e36213 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5271: 05f0400c50af843df301efb5475e9f5e2d16a098 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_15227: c207c66a6167356c31c47ae3ee2de718113a310a @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
c207c66a6167 drm/i915/perf: Configure OAR for specific context
3b64a508eefb drm/i915/perf: Allow non-privileged access when OA buffer is not sampled
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15227/index.html
^ permalink raw reply [flat|nested] 12+ messages in thread
[fdo#105541]: https://bugs.freedesktop.org/show_bug.cgi?id=105541
[fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
[fdo#108511]: https://bugs.freedesktop.org/show_bug.cgi?id=108511
[fdo#111045]: https://bugs.freedesktop.org/show_bug.cgi?id=111045
[fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
[fdo#112205]: https://bugs.freedesktop.org/show_bug.cgi?id=112205
Participating hosts (52 -> 43)
------------------------------
Missing (9): fi-ilk-m540 fi-hsw-4200u fi-byt-j1900 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-ivb-3770 fi-byt-clapper fi-bdw-samus
Build changes
-------------
* CI: CI-20190529 -> None
* Linux: CI_DRM_7311 -> Patchwork_15227
CI-20190529: 20190529
CI_DRM_7311: 36d31f70111ea87432ee8a8981943c5b20e36213 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5271: 05f0400c50af843df301efb5475e9f5e2d16a098 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_15227: c207c66a6167356c31c47ae3ee2de718113a310a @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
c207c66a6167 drm/i915/perf: Configure OAR for specific context
3b64a508eefb drm/i915/perf: Allow non-privileged access when OA buffer is not sampled
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15227/index.html
* Re: [PATCH 2/2] drm/i915/perf: Configure OAR for specific context
2019-11-18 13:42 ` Lionel Landwerlin
@ 2019-11-18 18:01 ` Umesh Nerlige Ramappa
0 siblings, 0 replies; 12+ messages in thread
From: Umesh Nerlige Ramappa @ 2019-11-18 18:01 UTC (permalink / raw)
To: Lionel Landwerlin; +Cc: intel-gfx, Chris Wilson
On Mon, Nov 18, 2019 at 03:42:28PM +0200, Lionel Landwerlin wrote:
>On 14/11/2019 21:21, Umesh Nerlige Ramappa wrote:
>>Gen12 supports saving/restoring render counters per context. Apply OAR
>>configuration only for the context that is passed in to perf.
>>
>>v2:
>>- Fix OACTXCONTROL value to only stop/resume counters.
>>- Remove gen12_update_reg_state_unlocked as power state is already
>> applied by the caller.
>>
>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>---
>> drivers/gpu/drm/i915/i915_perf.c | 193 +++++++++++++++++--------------
>> 1 file changed, 108 insertions(+), 85 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>>index 221c1090ae93..2f0be5fbef4b 100644
>>--- a/drivers/gpu/drm/i915/i915_perf.c
>>+++ b/drivers/gpu/drm/i915/i915_perf.c
>>@@ -2078,20 +2078,12 @@ gen8_update_reg_state_unlocked(const struct intel_context *ce,
>> u32 *reg_state = ce->lrc_reg_state;
>> int i;
>>- if (IS_GEN(stream->perf->i915, 12)) {
>>- u32 format = stream->oa_buffer.format;
>>+ reg_state[ctx_oactxctrl + 1] =
>>+ (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
>>+ (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
>>+ GEN8_OA_COUNTER_RESUME;
>>- reg_state[ctx_oactxctrl + 1] =
>>- (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
>>- (stream->oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
>>- } else {
>>- reg_state[ctx_oactxctrl + 1] =
>>- (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
>>- (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
>>- GEN8_OA_COUNTER_RESUME;
>>- }
>>-
>>- for (i = 0; !!ctx_flexeu0 && i < ARRAY_SIZE(flex_regs); i++)
>>+ for (i = 0; i < ARRAY_SIZE(flex_regs); i++)
>> reg_state[ctx_flexeu0 + i * 2 + 1] =
>> oa_config_flex_reg(stream->oa_config, flex_regs[i]);
>>@@ -2224,34 +2216,49 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
>> return err;
>> }
>>-static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
>>+static int gen12_configure_oar_context(struct i915_perf_stream *stream, bool enable)
>> {
>>- struct i915_request *rq;
>>- u32 *cs;
>>- int err = 0;
>>-
>>- rq = i915_request_create(ce);
>>- if (IS_ERR(rq))
>>- return PTR_ERR(rq);
>>+ int err;
>>+ struct intel_context *ce = stream->pinned_ctx;
>>+ struct flex regs_context[] = {
>>+ {
>>+ GEN8_OACTXCONTROL,
>>+ stream->perf->ctx_oactxctrl_offset + 1,
>>+ enable ? GEN8_OA_COUNTER_RESUME : 0,
>>+ },
>>+ };
>
>
>When do we configure the Flex registers?
I don't see flex registers in the context image dump or in the spec, so
I guess we would only configure them if they are part of the metric set.
I will post a new version that addresses the comments below.
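For reference, the pre-Gen12 path in the quoted patch locates each flex EU value with the ctx_flexeuN() arithmetic. Below is a minimal standalone sketch of that indexing. It assumes the context image holds (register offset, value) dword pairs and that a zero base offset means the image has no flex block. flex_value_index() and has_flex_regs() are hypothetical names for illustration, not driver functions.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the ctx_flexeuN() macro in the quoted patch.
 * The image is assumed to store (register offset, value) dword pairs, so
 * the value for flex register N sits at base + 2 * N + 1.
 */
static uint32_t flex_value_index(uint32_t ctx_flexeu0, unsigned int n)
{
	return ctx_flexeu0 + 2 * n + 1;
}

/*
 * Returns 1 when the image has a flex EU block. Treating a zero base
 * offset as "not present" is an assumption that mirrors the
 * !!ctx_flexeu0 test the patch drops from the shared path.
 */
static int has_flex_regs(uint32_t ctx_flexeu0)
{
	return ctx_flexeu0 != 0;
}
```

With a made-up base of 100, flex register 0 maps to dword 101 and flex register 6 to dword 113. A Gen12 caller would skip the loop whenever has_flex_regs() reports 0.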
Thanks,
Umesh
>
>
>>+ struct flex regs_lri[] = {
>>+ {
>>+ GEN12_OAR_OACONTROL,
>>+ },
>>+ {
>>+ RING_CONTEXT_CONTROL(ce->engine->mmio_base),
>>+ },
>>+ };
>>+ u32 format = stream->oa_buffer.format;
>>- cs = intel_ring_begin(rq, 4);
>>- if (IS_ERR(cs)) {
>>- err = PTR_ERR(cs);
>>- goto out;
>>- }
>>+ regs_lri[0].value =
>>+ (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
>>+ (enable ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
>>- *cs++ = MI_LOAD_REGISTER_IMM(1);
>>- *cs++ = i915_mmio_reg_offset(RING_CONTEXT_CONTROL(ce->engine->mmio_base));
>>- *cs++ = _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
>>- enable ? GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE : 0);
>>- *cs++ = MI_NOOP;
>>+ regs_lri[1].value =
>>+ _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
>>+ enable ?
>>+ GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE :
>>+ 0);
>
>
>Can't we put those values in the array above?
>
>
>>- intel_ring_advance(rq, cs);
>>+ /* Modify the context image of pinned context with regs_context*/
>>+ err = intel_context_lock_pinned(ce);
>>+ if (err)
>>+ return err;
>>-out:
>>- i915_request_add(rq);
>>+ err = gen8_modify_context(ce, regs_context, ARRAY_SIZE(regs_context));
>>+ intel_context_unlock_pinned(ce);
>>+ if (err)
>>+ return err;
>>- return err;
>>+ /* Apply regs_lri using LRI with pinned context */
>>+ return gen8_modify_self(ce, regs_lri, ARRAY_SIZE(regs_lri));
>> }
>> /*
>>@@ -2277,53 +2284,16 @@ static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
>> * per-context OA state.
>> *
>> * Note: it's only the RCS/Render context that has any OA state.
>>+ * Note: the first flex register passed must always be R_PWR_CLK_STATE
>> */
>>-static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>>- const struct i915_oa_config *oa_config)
>>+static int oa_configure_all_contexts(struct i915_perf_stream *stream,
>>+ struct flex *regs,
>>+ size_t num_regs)
>> {
>> struct drm_i915_private *i915 = stream->perf->i915;
>>- /* The MMIO offsets for Flex EU registers aren't contiguous */
>>- const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
>>-#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
>>- struct flex regs[] = {
>>- {
>>- GEN8_R_PWR_CLK_STATE,
>>- CTX_R_PWR_CLK_STATE,
>>- },
>>- {
>>- IS_GEN(i915, 12) ?
>>- GEN12_OAR_OACONTROL : GEN8_OACTXCONTROL,
>>- stream->perf->ctx_oactxctrl_offset + 1,
>>- },
>>- { EU_PERF_CNTL0, ctx_flexeuN(0) },
>>- { EU_PERF_CNTL1, ctx_flexeuN(1) },
>>- { EU_PERF_CNTL2, ctx_flexeuN(2) },
>>- { EU_PERF_CNTL3, ctx_flexeuN(3) },
>>- { EU_PERF_CNTL4, ctx_flexeuN(4) },
>>- { EU_PERF_CNTL5, ctx_flexeuN(5) },
>>- { EU_PERF_CNTL6, ctx_flexeuN(6) },
>>- };
>>-#undef ctx_flexeuN
>> struct intel_engine_cs *engine;
>> struct i915_gem_context *ctx, *cn;
>>- size_t array_size = IS_GEN(i915, 12) ? 2 : ARRAY_SIZE(regs);
>>- int i, err;
>>-
>>- if (IS_GEN(i915, 12)) {
>>- u32 format = stream->oa_buffer.format;
>>-
>>- regs[1].value =
>>- (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
>>- (oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
>>- } else {
>>- regs[1].value =
>>- (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
>>- (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
>>- GEN8_OA_COUNTER_RESUME;
>>- }
>>-
>>- for (i = 2; !!ctx_flexeu0 && i < array_size; i++)
>>- regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
>>+ int err;
>> lockdep_assert_held(&stream->perf->lock);
>>@@ -2353,7 +2323,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>> spin_unlock(&i915->gem.contexts.lock);
>>- err = gen8_configure_context(ctx, regs, array_size);
>>+ err = gen8_configure_context(ctx, regs, num_regs);
>> if (err) {
>> i915_gem_context_put(ctx);
>> return err;
>>@@ -2378,7 +2348,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>> regs[0].value = intel_sseu_make_rpcs(i915, &ce->sseu);
>>- err = gen8_modify_self(ce, regs, array_size);
>>+ err = gen8_modify_self(ce, regs, num_regs);
>> if (err)
>> return err;
>> }
>>@@ -2386,6 +2356,56 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>> return 0;
>> }
>>+static int gen12_configure_all_contexts(struct i915_perf_stream *stream,
>>+ const struct i915_oa_config *oa_config)
>>+{
>>+ struct flex regs[] = {
>>+ {
>>+ GEN8_R_PWR_CLK_STATE,
>>+ CTX_R_PWR_CLK_STATE,
>>+ },
>>+ };
>>+
>>+ return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
>>+}
>>+
>>+static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>>+ const struct i915_oa_config *oa_config)
>>+{
>>+ /* The MMIO offsets for Flex EU registers aren't contiguous */
>>+ const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
>>+#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
>>+ struct flex regs[] = {
>>+ {
>>+ GEN8_R_PWR_CLK_STATE,
>>+ CTX_R_PWR_CLK_STATE,
>>+ },
>>+ {
>>+ GEN8_OACTXCONTROL,
>>+ stream->perf->ctx_oactxctrl_offset + 1,
>>+ },
>>+ { EU_PERF_CNTL0, ctx_flexeuN(0) },
>>+ { EU_PERF_CNTL1, ctx_flexeuN(1) },
>>+ { EU_PERF_CNTL2, ctx_flexeuN(2) },
>>+ { EU_PERF_CNTL3, ctx_flexeuN(3) },
>>+ { EU_PERF_CNTL4, ctx_flexeuN(4) },
>>+ { EU_PERF_CNTL5, ctx_flexeuN(5) },
>>+ { EU_PERF_CNTL6, ctx_flexeuN(6) },
>>+ };
>>+#undef ctx_flexeuN
>>+ int i;
>>+
>>+ regs[1].value =
>>+ (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
>>+ (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
>>+ GEN8_OA_COUNTER_RESUME;
>>+
>>+ for (i = 2; i < ARRAY_SIZE(regs); i++)
>>+ regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
>>+
>>+ return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
>>+}
>>+
>> static int gen8_enable_metric_set(struct i915_perf_stream *stream)
>> {
>> struct intel_uncore *uncore = stream->uncore;
>>@@ -2469,7 +2489,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
>> * to make sure all slices/subslices are ON before writing to NOA
>> * registers.
>> */
>>- ret = lrc_configure_all_contexts(stream, oa_config);
>>+ ret = gen12_configure_all_contexts(stream, oa_config);
>> if (ret)
>> return ret;
>>@@ -2479,8 +2499,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
>> * requested this.
>> */
>> if (stream->ctx) {
>>- ret = gen12_emit_oar_config(stream->pinned_ctx,
>>- oa_config != NULL);
>>+ ret = gen12_configure_oar_context(stream, oa_config != NULL);
>
>
>I think you can assume oa_config is going to be != NULL in the
>enable_metric_set vfunc.
>
>
>> if (ret)
>> return ret;
>> }
>>@@ -2514,11 +2533,11 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream)
>> struct intel_uncore *uncore = stream->uncore;
>> /* Reset all contexts' slices/subslices configurations. */
>>- lrc_configure_all_contexts(stream, NULL);
>>+ gen12_configure_all_contexts(stream, NULL);
>> /* disable the context save/restore or OAR counters */
>> if (stream->ctx)
>>- gen12_emit_oar_config(stream->pinned_ctx, false);
>>+ gen12_configure_oar_context(stream, false);
>> /* Make sure we disable noa to save power. */
>> intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0);
>>@@ -2860,7 +2879,11 @@ void i915_oa_init_reg_state(const struct intel_context *ce,
>> return;
>> stream = engine->i915->perf.exclusive_stream;
>>- if (stream)
>>+ /*
>>+ * For gen12, only CTX_R_PWR_CLK_STATE needs update, but the caller
>>+ * is already doing that, so nothing to be done for gen12 here.
>>+ */
>>+ if (stream && INTEL_GEN(stream->perf->i915) < 12)
>> gen8_update_reg_state_unlocked(ce, stream);
>> }
>
>
* Re: [PATCH 2/2] drm/i915/perf: Configure OAR for specific context
2019-11-14 19:21 ` [PATCH 2/2] drm/i915/perf: Configure OAR for specific context Umesh Nerlige Ramappa
2019-11-14 19:45 ` Umesh Nerlige Ramappa
@ 2019-11-18 13:42 ` Lionel Landwerlin
2019-11-18 18:01 ` Umesh Nerlige Ramappa
1 sibling, 1 reply; 12+ messages in thread
From: Lionel Landwerlin @ 2019-11-18 13:42 UTC (permalink / raw)
To: Umesh Nerlige Ramappa, intel-gfx, Chris Wilson
On 14/11/2019 21:21, Umesh Nerlige Ramappa wrote:
> Gen12 supports saving/restoring render counters per context. Apply OAR
> configuration only for the context that is passed in to perf.
>
> v2:
> - Fix OACTXCONTROL value to only stop/resume counters.
> - Remove gen12_update_reg_state_unlocked as power state is already
> applied by the caller.
>
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
> drivers/gpu/drm/i915/i915_perf.c | 193 +++++++++++++++++--------------
> 1 file changed, 108 insertions(+), 85 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 221c1090ae93..2f0be5fbef4b 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -2078,20 +2078,12 @@ gen8_update_reg_state_unlocked(const struct intel_context *ce,
> u32 *reg_state = ce->lrc_reg_state;
> int i;
>
> - if (IS_GEN(stream->perf->i915, 12)) {
> - u32 format = stream->oa_buffer.format;
> + reg_state[ctx_oactxctrl + 1] =
> + (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
> + (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
> + GEN8_OA_COUNTER_RESUME;
>
> - reg_state[ctx_oactxctrl + 1] =
> - (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
> - (stream->oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
> - } else {
> - reg_state[ctx_oactxctrl + 1] =
> - (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
> - (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
> - GEN8_OA_COUNTER_RESUME;
> - }
> -
> - for (i = 0; !!ctx_flexeu0 && i < ARRAY_SIZE(flex_regs); i++)
> + for (i = 0; i < ARRAY_SIZE(flex_regs); i++)
> reg_state[ctx_flexeu0 + i * 2 + 1] =
> oa_config_flex_reg(stream->oa_config, flex_regs[i]);
>
> @@ -2224,34 +2216,49 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
> return err;
> }
>
> -static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
> +static int gen12_configure_oar_context(struct i915_perf_stream *stream, bool enable)
> {
> - struct i915_request *rq;
> - u32 *cs;
> - int err = 0;
> -
> - rq = i915_request_create(ce);
> - if (IS_ERR(rq))
> - return PTR_ERR(rq);
> + int err;
> + struct intel_context *ce = stream->pinned_ctx;
> + struct flex regs_context[] = {
> + {
> + GEN8_OACTXCONTROL,
> + stream->perf->ctx_oactxctrl_offset + 1,
> + enable ? GEN8_OA_COUNTER_RESUME : 0,
> + },
> + };
When do we configure the Flex registers?
> + struct flex regs_lri[] = {
> + {
> + GEN12_OAR_OACONTROL,
> + },
> + {
> + RING_CONTEXT_CONTROL(ce->engine->mmio_base),
> + },
> + };
> + u32 format = stream->oa_buffer.format;
>
> - cs = intel_ring_begin(rq, 4);
> - if (IS_ERR(cs)) {
> - err = PTR_ERR(cs);
> - goto out;
> - }
> + regs_lri[0].value =
> + (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
> + (enable ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
>
> - *cs++ = MI_LOAD_REGISTER_IMM(1);
> - *cs++ = i915_mmio_reg_offset(RING_CONTEXT_CONTROL(ce->engine->mmio_base));
> - *cs++ = _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
> - enable ? GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE : 0);
> - *cs++ = MI_NOOP;
> + regs_lri[1].value =
> + _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
> + enable ?
> + GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE :
> + 0);
Can't we put those values in the array above?
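Something along these lines, as a standalone sketch rather than a drop-in change. The bit positions, the placeholder register ids, struct flex_sketch and build_regs_lri() are all assumptions made for illustration; the real definitions are in i915_reg.h and i915_perf.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit layouts only; the real ones live in i915_reg.h. */
#define OAR_COUNTER_FORMAT_SHIFT 1
#define OAR_COUNTER_ENABLE       (1u << 0)
#define CTX_CTRL_OAR_ENABLE      (1u << 8)

/* Masked write: the upper 16 bits select which lower bits get updated. */
#define MASKED_FIELD(mask, value) (((uint32_t)(mask) << 16) | (value))

/* Hypothetical cut-down version of the patch's struct flex. */
struct flex_sketch {
	uint32_t reg;   /* placeholder id, not a real MMIO offset */
	uint32_t value;
};

enum { REG_OAR_OACONTROL, REG_CONTEXT_CONTROL };

/* Build both LRI entries with their values set in the initializer. */
static void build_regs_lri(struct flex_sketch out[2], uint32_t format,
			   bool enable)
{
	const struct flex_sketch regs_lri[] = {
		{
			REG_OAR_OACONTROL,
			(format << OAR_COUNTER_FORMAT_SHIFT) |
			(enable ? OAR_COUNTER_ENABLE : 0),
		},
		{
			REG_CONTEXT_CONTROL,
			MASKED_FIELD(CTX_CTRL_OAR_ENABLE,
				     enable ? CTX_CTRL_OAR_ENABLE : 0),
		},
	};

	out[0] = regs_lri[0];
	out[1] = regs_lri[1];
}
```

In the patch itself this would mean declaring format before regs_lri so the initializer can use it, after which the two separate assignments go away.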
>
> - intel_ring_advance(rq, cs);
> + /* Modify the context image of pinned context with regs_context*/
> + err = intel_context_lock_pinned(ce);
> + if (err)
> + return err;
>
> -out:
> - i915_request_add(rq);
> + err = gen8_modify_context(ce, regs_context, ARRAY_SIZE(regs_context));
> + intel_context_unlock_pinned(ce);
> + if (err)
> + return err;
>
> - return err;
> + /* Apply regs_lri using LRI with pinned context */
> + return gen8_modify_self(ce, regs_lri, ARRAY_SIZE(regs_lri));
> }
>
> /*
> @@ -2277,53 +2284,16 @@ static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
> * per-context OA state.
> *
> * Note: it's only the RCS/Render context that has any OA state.
> + * Note: the first flex register passed must always be R_PWR_CLK_STATE
> */
> -static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
> - const struct i915_oa_config *oa_config)
> +static int oa_configure_all_contexts(struct i915_perf_stream *stream,
> + struct flex *regs,
> + size_t num_regs)
> {
> struct drm_i915_private *i915 = stream->perf->i915;
> - /* The MMIO offsets for Flex EU registers aren't contiguous */
> - const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
> -#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
> - struct flex regs[] = {
> - {
> - GEN8_R_PWR_CLK_STATE,
> - CTX_R_PWR_CLK_STATE,
> - },
> - {
> - IS_GEN(i915, 12) ?
> - GEN12_OAR_OACONTROL : GEN8_OACTXCONTROL,
> - stream->perf->ctx_oactxctrl_offset + 1,
> - },
> - { EU_PERF_CNTL0, ctx_flexeuN(0) },
> - { EU_PERF_CNTL1, ctx_flexeuN(1) },
> - { EU_PERF_CNTL2, ctx_flexeuN(2) },
> - { EU_PERF_CNTL3, ctx_flexeuN(3) },
> - { EU_PERF_CNTL4, ctx_flexeuN(4) },
> - { EU_PERF_CNTL5, ctx_flexeuN(5) },
> - { EU_PERF_CNTL6, ctx_flexeuN(6) },
> - };
> -#undef ctx_flexeuN
> struct intel_engine_cs *engine;
> struct i915_gem_context *ctx, *cn;
> - size_t array_size = IS_GEN(i915, 12) ? 2 : ARRAY_SIZE(regs);
> - int i, err;
> -
> - if (IS_GEN(i915, 12)) {
> - u32 format = stream->oa_buffer.format;
> -
> - regs[1].value =
> - (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
> - (oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
> - } else {
> - regs[1].value =
> - (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
> - (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
> - GEN8_OA_COUNTER_RESUME;
> - }
> -
> - for (i = 2; !!ctx_flexeu0 && i < array_size; i++)
> - regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
> + int err;
>
> lockdep_assert_held(&stream->perf->lock);
>
> @@ -2353,7 +2323,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>
> spin_unlock(&i915->gem.contexts.lock);
>
> - err = gen8_configure_context(ctx, regs, array_size);
> + err = gen8_configure_context(ctx, regs, num_regs);
> if (err) {
> i915_gem_context_put(ctx);
> return err;
> @@ -2378,7 +2348,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>
> regs[0].value = intel_sseu_make_rpcs(i915, &ce->sseu);
>
> - err = gen8_modify_self(ce, regs, array_size);
> + err = gen8_modify_self(ce, regs, num_regs);
> if (err)
> return err;
> }
> @@ -2386,6 +2356,56 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
> return 0;
> }
>
> +static int gen12_configure_all_contexts(struct i915_perf_stream *stream,
> + const struct i915_oa_config *oa_config)
> +{
> + struct flex regs[] = {
> + {
> + GEN8_R_PWR_CLK_STATE,
> + CTX_R_PWR_CLK_STATE,
> + },
> + };
> +
> + return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
> +}
> +
> +static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
> + const struct i915_oa_config *oa_config)
> +{
> + /* The MMIO offsets for Flex EU registers aren't contiguous */
> + const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
> +#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
> + struct flex regs[] = {
> + {
> + GEN8_R_PWR_CLK_STATE,
> + CTX_R_PWR_CLK_STATE,
> + },
> + {
> + GEN8_OACTXCONTROL,
> + stream->perf->ctx_oactxctrl_offset + 1,
> + },
> + { EU_PERF_CNTL0, ctx_flexeuN(0) },
> + { EU_PERF_CNTL1, ctx_flexeuN(1) },
> + { EU_PERF_CNTL2, ctx_flexeuN(2) },
> + { EU_PERF_CNTL3, ctx_flexeuN(3) },
> + { EU_PERF_CNTL4, ctx_flexeuN(4) },
> + { EU_PERF_CNTL5, ctx_flexeuN(5) },
> + { EU_PERF_CNTL6, ctx_flexeuN(6) },
> + };
> +#undef ctx_flexeuN
> + int i;
> +
> + regs[1].value =
> + (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
> + (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
> + GEN8_OA_COUNTER_RESUME;
> +
> + for (i = 2; i < ARRAY_SIZE(regs); i++)
> + regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
> +
> + return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
> +}
> +
> static int gen8_enable_metric_set(struct i915_perf_stream *stream)
> {
> struct intel_uncore *uncore = stream->uncore;
> @@ -2469,7 +2489,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
> * to make sure all slices/subslices are ON before writing to NOA
> * registers.
> */
> - ret = lrc_configure_all_contexts(stream, oa_config);
> + ret = gen12_configure_all_contexts(stream, oa_config);
> if (ret)
> return ret;
>
> @@ -2479,8 +2499,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
> * requested this.
> */
> if (stream->ctx) {
> - ret = gen12_emit_oar_config(stream->pinned_ctx,
> - oa_config != NULL);
> + ret = gen12_configure_oar_context(stream, oa_config != NULL);
I think you can assume oa_config is going to be != NULL in the
enable_metric_set vfunc.
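A rough standalone sketch of what that looks like, assuming the enable path always has a metric set. OA_COUNTER_RESUME here is an illustrative stand-in for GEN8_OA_COUNTER_RESUME, and the three helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for GEN8_OA_COUNTER_RESUME. */
#define OA_COUNTER_RESUME (1u << 0)

/* Value for the pinned context's OACTXCONTROL slot. */
static uint32_t oactxcontrol_value(bool enable)
{
	return enable ? OA_COUNTER_RESUME : 0;
}

/* Enable path: pass true directly instead of testing oa_config. */
static uint32_t on_enable_metric_set(void)
{
	return oactxcontrol_value(true);
}

/* Disable path: counters are stopped unconditionally. */
static uint32_t on_disable_metric_set(void)
{
	return oactxcontrol_value(false);
}
```

The enable call site then reduces to passing true, and only the disable path passes false.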
> if (ret)
> return ret;
> }
> @@ -2514,11 +2533,11 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream)
> struct intel_uncore *uncore = stream->uncore;
>
> /* Reset all contexts' slices/subslices configurations. */
> - lrc_configure_all_contexts(stream, NULL);
> + gen12_configure_all_contexts(stream, NULL);
>
> /* disable the context save/restore or OAR counters */
> if (stream->ctx)
> - gen12_emit_oar_config(stream->pinned_ctx, false);
> + gen12_configure_oar_context(stream, false);
>
> /* Make sure we disable noa to save power. */
> intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0);
> @@ -2860,7 +2879,11 @@ void i915_oa_init_reg_state(const struct intel_context *ce,
> return;
>
> stream = engine->i915->perf.exclusive_stream;
> - if (stream)
> + /*
> + * For gen12, only CTX_R_PWR_CLK_STATE needs update, but the caller
> + * is already doing that, so nothing to be done for gen12 here.
> + */
> + if (stream && INTEL_GEN(stream->perf->i915) < 12)
> gen8_update_reg_state_unlocked(ce, stream);
> }
>
* Re: [PATCH 2/2] drm/i915/perf: Configure OAR for specific context
2019-11-14 19:21 ` [PATCH 2/2] drm/i915/perf: Configure OAR for specific context Umesh Nerlige Ramappa
@ 2019-11-14 19:45 ` Umesh Nerlige Ramappa
2019-11-18 13:42 ` Lionel Landwerlin
1 sibling, 0 replies; 12+ messages in thread
From: Umesh Nerlige Ramappa @ 2019-11-14 19:45 UTC (permalink / raw)
To: intel-gfx, Lionel G Landwerlin, Chris Wilson
On Thu, Nov 14, 2019 at 11:21:14AM -0800, Umesh Nerlige Ramappa wrote:
>Gen12 supports saving/restoring render counters per context. Apply OAR
>configuration only for the context that is passed in to perf.
>
>v2:
>- Fix OACTXCONTROL value to only stop/resume counters.
>- Remove gen12_update_reg_state_unlocked as power state is already
> applied by the caller.
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
IGT Tests in this series:
https://patchwork.freedesktop.org/series/69322/
gen12-mi-rpc
gen12-unprivileged-single-ctx-counters
Thanks,
Umesh
>---
> drivers/gpu/drm/i915/i915_perf.c | 193 +++++++++++++++++--------------
> 1 file changed, 108 insertions(+), 85 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>index 221c1090ae93..2f0be5fbef4b 100644
>--- a/drivers/gpu/drm/i915/i915_perf.c
>+++ b/drivers/gpu/drm/i915/i915_perf.c
>@@ -2078,20 +2078,12 @@ gen8_update_reg_state_unlocked(const struct intel_context *ce,
> u32 *reg_state = ce->lrc_reg_state;
> int i;
>
>- if (IS_GEN(stream->perf->i915, 12)) {
>- u32 format = stream->oa_buffer.format;
>+ reg_state[ctx_oactxctrl + 1] =
>+ (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
>+ (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
>+ GEN8_OA_COUNTER_RESUME;
>
>- reg_state[ctx_oactxctrl + 1] =
>- (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
>- (stream->oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
>- } else {
>- reg_state[ctx_oactxctrl + 1] =
>- (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
>- (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
>- GEN8_OA_COUNTER_RESUME;
>- }
>-
>- for (i = 0; !!ctx_flexeu0 && i < ARRAY_SIZE(flex_regs); i++)
>+ for (i = 0; i < ARRAY_SIZE(flex_regs); i++)
> reg_state[ctx_flexeu0 + i * 2 + 1] =
> oa_config_flex_reg(stream->oa_config, flex_regs[i]);
>
>@@ -2224,34 +2216,49 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
> return err;
> }
>
>-static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
>+static int gen12_configure_oar_context(struct i915_perf_stream *stream, bool enable)
> {
>- struct i915_request *rq;
>- u32 *cs;
>- int err = 0;
>-
>- rq = i915_request_create(ce);
>- if (IS_ERR(rq))
>- return PTR_ERR(rq);
>+ int err;
>+ struct intel_context *ce = stream->pinned_ctx;
>+ struct flex regs_context[] = {
>+ {
>+ GEN8_OACTXCONTROL,
>+ stream->perf->ctx_oactxctrl_offset + 1,
>+ enable ? GEN8_OA_COUNTER_RESUME : 0,
>+ },
>+ };
>+ struct flex regs_lri[] = {
>+ {
>+ GEN12_OAR_OACONTROL,
>+ },
>+ {
>+ RING_CONTEXT_CONTROL(ce->engine->mmio_base),
>+ },
>+ };
>+ u32 format = stream->oa_buffer.format;
>
>- cs = intel_ring_begin(rq, 4);
>- if (IS_ERR(cs)) {
>- err = PTR_ERR(cs);
>- goto out;
>- }
>+ regs_lri[0].value =
>+ (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
>+ (enable ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
>
>- *cs++ = MI_LOAD_REGISTER_IMM(1);
>- *cs++ = i915_mmio_reg_offset(RING_CONTEXT_CONTROL(ce->engine->mmio_base));
>- *cs++ = _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
>- enable ? GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE : 0);
>- *cs++ = MI_NOOP;
>+ regs_lri[1].value =
>+ _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
>+ enable ?
>+ GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE :
>+ 0);
>
>- intel_ring_advance(rq, cs);
>+ /* Modify the context image of pinned context with regs_context*/
>+ err = intel_context_lock_pinned(ce);
>+ if (err)
>+ return err;
>
>-out:
>- i915_request_add(rq);
>+ err = gen8_modify_context(ce, regs_context, ARRAY_SIZE(regs_context));
>+ intel_context_unlock_pinned(ce);
>+ if (err)
>+ return err;
>
>- return err;
>+ /* Apply regs_lri using LRI with pinned context */
>+ return gen8_modify_self(ce, regs_lri, ARRAY_SIZE(regs_lri));
> }
>
> /*
>@@ -2277,53 +2284,16 @@ static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
> * per-context OA state.
> *
> * Note: it's only the RCS/Render context that has any OA state.
>+ * Note: the first flex register passed must always be R_PWR_CLK_STATE
> */
>-static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>- const struct i915_oa_config *oa_config)
>+static int oa_configure_all_contexts(struct i915_perf_stream *stream,
>+ struct flex *regs,
>+ size_t num_regs)
> {
> struct drm_i915_private *i915 = stream->perf->i915;
>- /* The MMIO offsets for Flex EU registers aren't contiguous */
>- const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
>-#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
>- struct flex regs[] = {
>- {
>- GEN8_R_PWR_CLK_STATE,
>- CTX_R_PWR_CLK_STATE,
>- },
>- {
>- IS_GEN(i915, 12) ?
>- GEN12_OAR_OACONTROL : GEN8_OACTXCONTROL,
>- stream->perf->ctx_oactxctrl_offset + 1,
>- },
>- { EU_PERF_CNTL0, ctx_flexeuN(0) },
>- { EU_PERF_CNTL1, ctx_flexeuN(1) },
>- { EU_PERF_CNTL2, ctx_flexeuN(2) },
>- { EU_PERF_CNTL3, ctx_flexeuN(3) },
>- { EU_PERF_CNTL4, ctx_flexeuN(4) },
>- { EU_PERF_CNTL5, ctx_flexeuN(5) },
>- { EU_PERF_CNTL6, ctx_flexeuN(6) },
>- };
>-#undef ctx_flexeuN
> struct intel_engine_cs *engine;
> struct i915_gem_context *ctx, *cn;
>- size_t array_size = IS_GEN(i915, 12) ? 2 : ARRAY_SIZE(regs);
>- int i, err;
>-
>- if (IS_GEN(i915, 12)) {
>- u32 format = stream->oa_buffer.format;
>-
>- regs[1].value =
>- (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
>- (oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
>- } else {
>- regs[1].value =
>- (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
>- (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
>- GEN8_OA_COUNTER_RESUME;
>- }
>-
>- for (i = 2; !!ctx_flexeu0 && i < array_size; i++)
>- regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
>+ int err;
>
> lockdep_assert_held(&stream->perf->lock);
>
>@@ -2353,7 +2323,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>
> spin_unlock(&i915->gem.contexts.lock);
>
>- err = gen8_configure_context(ctx, regs, array_size);
>+ err = gen8_configure_context(ctx, regs, num_regs);
> if (err) {
> i915_gem_context_put(ctx);
> return err;
>@@ -2378,7 +2348,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>
> regs[0].value = intel_sseu_make_rpcs(i915, &ce->sseu);
>
>- err = gen8_modify_self(ce, regs, array_size);
>+ err = gen8_modify_self(ce, regs, num_regs);
> if (err)
> return err;
> }
>@@ -2386,6 +2356,56 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
> return 0;
> }
>
>+static int gen12_configure_all_contexts(struct i915_perf_stream *stream,
>+ const struct i915_oa_config *oa_config)
>+{
>+ struct flex regs[] = {
>+ {
>+ GEN8_R_PWR_CLK_STATE,
>+ CTX_R_PWR_CLK_STATE,
>+ },
>+ };
>+
>+ return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
>+}
>+
>+static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
>+ const struct i915_oa_config *oa_config)
>+{
>+ /* The MMIO offsets for Flex EU registers aren't contiguous */
>+ const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
>+#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
>+ struct flex regs[] = {
>+ {
>+ GEN8_R_PWR_CLK_STATE,
>+ CTX_R_PWR_CLK_STATE,
>+ },
>+ {
>+ GEN8_OACTXCONTROL,
>+ stream->perf->ctx_oactxctrl_offset + 1,
>+ },
>+ { EU_PERF_CNTL0, ctx_flexeuN(0) },
>+ { EU_PERF_CNTL1, ctx_flexeuN(1) },
>+ { EU_PERF_CNTL2, ctx_flexeuN(2) },
>+ { EU_PERF_CNTL3, ctx_flexeuN(3) },
>+ { EU_PERF_CNTL4, ctx_flexeuN(4) },
>+ { EU_PERF_CNTL5, ctx_flexeuN(5) },
>+ { EU_PERF_CNTL6, ctx_flexeuN(6) },
>+ };
>+#undef ctx_flexeuN
>+ int i;
>+
>+ regs[1].value =
>+ (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
>+ (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
>+ GEN8_OA_COUNTER_RESUME;
>+
>+ for (i = 2; i < ARRAY_SIZE(regs); i++)
>+ regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
>+
>+ return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
>+}
>+
> static int gen8_enable_metric_set(struct i915_perf_stream *stream)
> {
> struct intel_uncore *uncore = stream->uncore;
>@@ -2469,7 +2489,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
> * to make sure all slices/subslices are ON before writing to NOA
> * registers.
> */
>- ret = lrc_configure_all_contexts(stream, oa_config);
>+ ret = gen12_configure_all_contexts(stream, oa_config);
> if (ret)
> return ret;
>
>@@ -2479,8 +2499,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
> * requested this.
> */
> if (stream->ctx) {
>- ret = gen12_emit_oar_config(stream->pinned_ctx,
>- oa_config != NULL);
>+ ret = gen12_configure_oar_context(stream, oa_config != NULL);
> if (ret)
> return ret;
> }
>@@ -2514,11 +2533,11 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream)
> struct intel_uncore *uncore = stream->uncore;
>
> /* Reset all contexts' slices/subslices configurations. */
>- lrc_configure_all_contexts(stream, NULL);
>+ gen12_configure_all_contexts(stream, NULL);
>
> /* disable the context save/restore or OAR counters */
> if (stream->ctx)
>- gen12_emit_oar_config(stream->pinned_ctx, false);
>+ gen12_configure_oar_context(stream, false);
>
> /* Make sure we disable noa to save power. */
> intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0);
>@@ -2860,7 +2879,11 @@ void i915_oa_init_reg_state(const struct intel_context *ce,
> return;
>
> stream = engine->i915->perf.exclusive_stream;
>- if (stream)
>+ /*
>+ * For gen12, only CTX_R_PWR_CLK_STATE needs update, but the caller
>+ * is already doing that, so nothing to be done for gen12 here.
>+ */
>+ if (stream && INTEL_GEN(stream->perf->i915) < 12)
> gen8_update_reg_state_unlocked(ce, stream);
> }
>
>--
>2.20.1
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 2/2] drm/i915/perf: Configure OAR for specific context
2019-11-14 19:21 [PATCH 1/2] " Umesh Nerlige Ramappa
@ 2019-11-14 19:21 ` Umesh Nerlige Ramappa
2019-11-14 19:45 ` Umesh Nerlige Ramappa
2019-11-18 13:42 ` Lionel Landwerlin
0 siblings, 2 replies; 12+ messages in thread
From: Umesh Nerlige Ramappa @ 2019-11-14 19:21 UTC (permalink / raw)
To: intel-gfx, Lionel G Landwerlin, Chris Wilson
Gen12 supports saving and restoring the render (OAR) counters per context.
Apply the OAR configuration only to the context that is passed in to perf,
instead of writing it to every context.
v2:
- Fix OACTXCONTROL value to only stop/resume counters.
- Remove gen12_update_reg_state_unlocked as power state is already
applied by the caller.
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
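For illustration only, not part of the patch: a minimal standalone sketch of
the three register values that gen12_configure_oar_context() computes from
`format` and `enable`. The `sketch_*` and `SKETCH_*` names are hypothetical,
and the bit positions are placeholders assumed for the example; the real
definitions live in i915_reg.h and should be checked there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit layouts, assumed for illustration only. */
#define SKETCH_OA_COUNTER_RESUME        (1u << 0)
#define SKETCH_OAR_COUNTER_ENABLE       (1u << 0)
#define SKETCH_OAR_COUNTER_FORMAT_SHIFT 1
#define SKETCH_CTX_CTRL_OAR_ENABLE      (1u << 8)

/* Masked write: the upper 16 bits select which lower bits get updated. */
static inline uint32_t sketch_masked_field(uint32_t mask, uint32_t value)
{
	return (mask << 16) | value;
}

struct sketch_oar_values {
	uint32_t oactxcontrol;  /* patched into the pinned context image */
	uint32_t oar_oacontrol; /* loaded via LRI from the context itself */
	uint32_t ctx_control;   /* loaded via LRI as a masked field */
};

static struct sketch_oar_values sketch_oar_config(uint32_t format, bool enable)
{
	struct sketch_oar_values v;

	/* The shifted format must survive the shift without losing bits. */
	assert(((format << SKETCH_OAR_COUNTER_FORMAT_SHIFT) >>
		SKETCH_OAR_COUNTER_FORMAT_SHIFT) == format);

	/* v2 behaviour: OACTXCONTROL only stops or resumes the counters. */
	v.oactxcontrol = enable ? SKETCH_OA_COUNTER_RESUME : 0;

	/* The report format is always programmed; the enable bit is optional. */
	v.oar_oacontrol =
		(format << SKETCH_OAR_COUNTER_FORMAT_SHIFT) |
		(enable ? SKETCH_OAR_COUNTER_ENABLE : 0);

	/* The mask is always set, so disabling clears the bit explicitly. */
	v.ctx_control =
		sketch_masked_field(SKETCH_CTX_CTRL_OAR_ENABLE,
				    enable ? SKETCH_CTX_CTRL_OAR_ENABLE : 0);

	return v;
}
```

Calling sketch_oar_config(format, false) still yields a non-zero ctx_control:
the mask half stays set so that the write clears the enable bit rather than
leaving it untouched.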
drivers/gpu/drm/i915/i915_perf.c | 193 +++++++++++++++++--------------
1 file changed, 108 insertions(+), 85 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 221c1090ae93..2f0be5fbef4b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -2078,20 +2078,12 @@ gen8_update_reg_state_unlocked(const struct intel_context *ce,
u32 *reg_state = ce->lrc_reg_state;
int i;
- if (IS_GEN(stream->perf->i915, 12)) {
- u32 format = stream->oa_buffer.format;
+ reg_state[ctx_oactxctrl + 1] =
+ (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
+ (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
+ GEN8_OA_COUNTER_RESUME;
- reg_state[ctx_oactxctrl + 1] =
- (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
- (stream->oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
- } else {
- reg_state[ctx_oactxctrl + 1] =
- (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
- (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
- GEN8_OA_COUNTER_RESUME;
- }
-
- for (i = 0; !!ctx_flexeu0 && i < ARRAY_SIZE(flex_regs); i++)
+ for (i = 0; i < ARRAY_SIZE(flex_regs); i++)
reg_state[ctx_flexeu0 + i * 2 + 1] =
oa_config_flex_reg(stream->oa_config, flex_regs[i]);
@@ -2224,34 +2216,49 @@ static int gen8_configure_context(struct i915_gem_context *ctx,
return err;
}
-static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
+static int gen12_configure_oar_context(struct i915_perf_stream *stream, bool enable)
{
- struct i915_request *rq;
- u32 *cs;
- int err = 0;
-
- rq = i915_request_create(ce);
- if (IS_ERR(rq))
- return PTR_ERR(rq);
+ int err;
+ struct intel_context *ce = stream->pinned_ctx;
+ struct flex regs_context[] = {
+ {
+ GEN8_OACTXCONTROL,
+ stream->perf->ctx_oactxctrl_offset + 1,
+ enable ? GEN8_OA_COUNTER_RESUME : 0,
+ },
+ };
+ struct flex regs_lri[] = {
+ {
+ GEN12_OAR_OACONTROL,
+ },
+ {
+ RING_CONTEXT_CONTROL(ce->engine->mmio_base),
+ },
+ };
+ u32 format = stream->oa_buffer.format;
- cs = intel_ring_begin(rq, 4);
- if (IS_ERR(cs)) {
- err = PTR_ERR(cs);
- goto out;
- }
+ regs_lri[0].value =
+ (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
+ (enable ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
- *cs++ = MI_LOAD_REGISTER_IMM(1);
- *cs++ = i915_mmio_reg_offset(RING_CONTEXT_CONTROL(ce->engine->mmio_base));
- *cs++ = _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
- enable ? GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE : 0);
- *cs++ = MI_NOOP;
+ regs_lri[1].value =
+ _MASKED_FIELD(GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE,
+ enable ?
+ GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE :
+ 0);
- intel_ring_advance(rq, cs);
+ /* Modify the context image of pinned context with regs_context */
+ err = intel_context_lock_pinned(ce);
+ if (err)
+ return err;
-out:
- i915_request_add(rq);
+ err = gen8_modify_context(ce, regs_context, ARRAY_SIZE(regs_context));
+ intel_context_unlock_pinned(ce);
+ if (err)
+ return err;
- return err;
+ /* Apply regs_lri using LRI with pinned context */
+ return gen8_modify_self(ce, regs_lri, ARRAY_SIZE(regs_lri));
}
/*
@@ -2277,53 +2284,16 @@ static int gen12_emit_oar_config(struct intel_context *ce, bool enable)
* per-context OA state.
*
* Note: it's only the RCS/Render context that has any OA state.
+ * Note: the first flex register passed must always be R_PWR_CLK_STATE
*/
-static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
- const struct i915_oa_config *oa_config)
+static int oa_configure_all_contexts(struct i915_perf_stream *stream,
+ struct flex *regs,
+ size_t num_regs)
{
struct drm_i915_private *i915 = stream->perf->i915;
- /* The MMIO offsets for Flex EU registers aren't contiguous */
- const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
-#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
- struct flex regs[] = {
- {
- GEN8_R_PWR_CLK_STATE,
- CTX_R_PWR_CLK_STATE,
- },
- {
- IS_GEN(i915, 12) ?
- GEN12_OAR_OACONTROL : GEN8_OACTXCONTROL,
- stream->perf->ctx_oactxctrl_offset + 1,
- },
- { EU_PERF_CNTL0, ctx_flexeuN(0) },
- { EU_PERF_CNTL1, ctx_flexeuN(1) },
- { EU_PERF_CNTL2, ctx_flexeuN(2) },
- { EU_PERF_CNTL3, ctx_flexeuN(3) },
- { EU_PERF_CNTL4, ctx_flexeuN(4) },
- { EU_PERF_CNTL5, ctx_flexeuN(5) },
- { EU_PERF_CNTL6, ctx_flexeuN(6) },
- };
-#undef ctx_flexeuN
struct intel_engine_cs *engine;
struct i915_gem_context *ctx, *cn;
- size_t array_size = IS_GEN(i915, 12) ? 2 : ARRAY_SIZE(regs);
- int i, err;
-
- if (IS_GEN(i915, 12)) {
- u32 format = stream->oa_buffer.format;
-
- regs[1].value =
- (format << GEN12_OAR_OACONTROL_COUNTER_FORMAT_SHIFT) |
- (oa_config ? GEN12_OAR_OACONTROL_COUNTER_ENABLE : 0);
- } else {
- regs[1].value =
- (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
- (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
- GEN8_OA_COUNTER_RESUME;
- }
-
- for (i = 2; !!ctx_flexeu0 && i < array_size; i++)
- regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
+ int err;
lockdep_assert_held(&stream->perf->lock);
@@ -2353,7 +2323,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
spin_unlock(&i915->gem.contexts.lock);
- err = gen8_configure_context(ctx, regs, array_size);
+ err = gen8_configure_context(ctx, regs, num_regs);
if (err) {
i915_gem_context_put(ctx);
return err;
@@ -2378,7 +2348,7 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
regs[0].value = intel_sseu_make_rpcs(i915, &ce->sseu);
- err = gen8_modify_self(ce, regs, array_size);
+ err = gen8_modify_self(ce, regs, num_regs);
if (err)
return err;
}
@@ -2386,6 +2356,56 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
return 0;
}
+static int gen12_configure_all_contexts(struct i915_perf_stream *stream,
+ const struct i915_oa_config *oa_config)
+{
+ struct flex regs[] = {
+ {
+ GEN8_R_PWR_CLK_STATE,
+ CTX_R_PWR_CLK_STATE,
+ },
+ };
+
+ return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
+}
+
+static int lrc_configure_all_contexts(struct i915_perf_stream *stream,
+ const struct i915_oa_config *oa_config)
+{
+ /* The MMIO offsets for Flex EU registers aren't contiguous */
+ const u32 ctx_flexeu0 = stream->perf->ctx_flexeu0_offset;
+#define ctx_flexeuN(N) (ctx_flexeu0 + 2 * (N) + 1)
+ struct flex regs[] = {
+ {
+ GEN8_R_PWR_CLK_STATE,
+ CTX_R_PWR_CLK_STATE,
+ },
+ {
+ GEN8_OACTXCONTROL,
+ stream->perf->ctx_oactxctrl_offset + 1,
+ },
+ { EU_PERF_CNTL0, ctx_flexeuN(0) },
+ { EU_PERF_CNTL1, ctx_flexeuN(1) },
+ { EU_PERF_CNTL2, ctx_flexeuN(2) },
+ { EU_PERF_CNTL3, ctx_flexeuN(3) },
+ { EU_PERF_CNTL4, ctx_flexeuN(4) },
+ { EU_PERF_CNTL5, ctx_flexeuN(5) },
+ { EU_PERF_CNTL6, ctx_flexeuN(6) },
+ };
+#undef ctx_flexeuN
+ int i;
+
+ regs[1].value =
+ (stream->period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
+ (stream->periodic ? GEN8_OA_TIMER_ENABLE : 0) |
+ GEN8_OA_COUNTER_RESUME;
+
+ for (i = 2; i < ARRAY_SIZE(regs); i++)
+ regs[i].value = oa_config_flex_reg(oa_config, regs[i].reg);
+
+ return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs));
+}
+
static int gen8_enable_metric_set(struct i915_perf_stream *stream)
{
struct intel_uncore *uncore = stream->uncore;
@@ -2469,7 +2489,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
* to make sure all slices/subslices are ON before writing to NOA
* registers.
*/
- ret = lrc_configure_all_contexts(stream, oa_config);
+ ret = gen12_configure_all_contexts(stream, oa_config);
if (ret)
return ret;
@@ -2479,8 +2499,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream)
* requested this.
*/
if (stream->ctx) {
- ret = gen12_emit_oar_config(stream->pinned_ctx,
- oa_config != NULL);
+ ret = gen12_configure_oar_context(stream, oa_config != NULL);
if (ret)
return ret;
}
@@ -2514,11 +2533,11 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream)
struct intel_uncore *uncore = stream->uncore;
/* Reset all contexts' slices/subslices configurations. */
- lrc_configure_all_contexts(stream, NULL);
+ gen12_configure_all_contexts(stream, NULL);
/* disable the context save/restore or OAR counters */
if (stream->ctx)
- gen12_emit_oar_config(stream->pinned_ctx, false);
+ gen12_configure_oar_context(stream, false);
/* Make sure we disable noa to save power. */
intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0);
@@ -2860,7 +2879,11 @@ void i915_oa_init_reg_state(const struct intel_context *ce,
return;
stream = engine->i915->perf.exclusive_stream;
- if (stream)
+ /*
+ * For gen12, only CTX_R_PWR_CLK_STATE needs update, but the caller
+ * is already doing that, so nothing to be done for gen12 here.
+ */
+ if (stream && INTEL_GEN(stream->perf->i915) < 12)
gen8_update_reg_state_unlocked(ce, stream);
}
--
2.20.1
Thread overview: 12+ messages
2019-11-11 22:09 [PATCH 1/2] drm/i915/perf: Allow non-privileged access when OA buffer is not sampled Umesh Nerlige Ramappa
2019-11-11 22:09 ` [Intel-gfx] " Umesh Nerlige Ramappa
2019-11-11 22:09 ` [PATCH 2/2] drm/i915/perf: Configure OAR for specific context Umesh Nerlige Ramappa
2019-11-11 22:09 ` [Intel-gfx] " Umesh Nerlige Ramappa
2019-11-11 23:32 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/perf: Allow non-privileged access when OA buffer is not sampled Patchwork
2019-11-11 23:32 ` [Intel-gfx] " Patchwork
2019-11-11 23:57 ` ✗ Fi.CI.BAT: failure " Patchwork
2019-11-11 23:57 ` [Intel-gfx] " Patchwork
2019-11-14 19:21 [PATCH 1/2] " Umesh Nerlige Ramappa
2019-11-14 19:21 ` [PATCH 2/2] drm/i915/perf: Configure OAR for specific context Umesh Nerlige Ramappa
2019-11-14 19:45 ` Umesh Nerlige Ramappa
2019-11-18 13:42 ` Lionel Landwerlin
2019-11-18 18:01 ` Umesh Nerlige Ramappa