All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michel Thierry <michel.thierry@intel.com>
To: intel-gfx@lists.freedesktop.org
Subject: [PATCH v8 12/20] drm/i915/guc: Provide register list to be saved/restored during engine reset
Date: Mon, 22 May 2017 10:46:33 -0700	[thread overview]
Message-ID: <20170522174641.25354-13-michel.thierry@intel.com> (raw)
In-Reply-To: <20170522174641.25354-1-michel.thierry@intel.com>

GuC expects a list of registers from the driver which are saved/restored
during engine reset. The type of value to be saved is controlled by
flags. We provide a minimal set of registers that we want GuC to save and
restore. This is not an issue in case of engine reset as driver initializes
most of them following an engine reset, but in case of media reset (aka
watchdog reset) which is completely internal to GuC (including resubmission
of hung workload), it is necessary to provide this list, otherwise GuC won't
be able to schedule further workloads after a reset. This is the minimal
set of registers identified for things to work as expected but if we see
any new issues, this register list can be expanded.

In order to not loose any existing workarounds, we have to let GuC know
the registers and its values. These will be reapplied after the reset.
Note that we can't just read the current value because most of these
registers are masked (so we have a workaround for a workaround for a
workaround).

v2: REGSET_MASKED is too difficult for GuC, use REGSET_SAVE_DEFAULT_VALUE
and current value from RING_MODE reg instead; no need to preserve
head/tail either, be extra paranoid and save whitelisted registers (Daniele).

v3: Workarounds added only once during _init_workarounds also have to
been restored, or we risk loosing them after internal GuC reset
(Daniele).

v4: Rename macro used to keep track the workaround registers we will
have to restore after reset (s/I915_GUC_REG_WRITE/WA_REG_WR_GUC_RESTORE).

v5: Only ask guc to reapply workarounds in case of render reset (Daniele).

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  3 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 70 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_engine_cs.c     | 65 ++++++++++++++++++---------
 3 files changed, 116 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ada9e2cc969c..28f48678c91f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1869,7 +1869,10 @@ struct i915_wa_reg {
 
 struct i915_workarounds {
 	struct i915_wa_reg reg[I915_MAX_WA_REGS];
+	/* list of registers (and their values) that GuC will have to restore */
+	struct i915_wa_reg guc_reg[GUC_REGSET_MAX_REGISTERS];
 	u32 count;
+	u32 guc_count;
 	u32 hw_whitelist_count[I915_NUM_ENGINES];
 };
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 50d3e5aaf566..8650b6ec2f2d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1017,6 +1017,24 @@ static void guc_policies_init(struct guc_policies *policies)
 	policies->is_valid = 1;
 }
 
+/*
+ * In this macro it is highly unlikely to exceed max value but even if we did
+ * it is not an error so just throw a warning and continue. Only side effect
+ * in continuing further means some registers won't be added to save/restore
+ * list.
+ */
+#define GUC_ADD_MMIO_REG_ADS(node, reg_addr, _flags, defvalue)		\
+	do {								\
+		u32 __count = node->number_of_registers;		\
+		if (WARN_ON(__count >= GUC_REGSET_MAX_REGISTERS))	\
+			continue;					\
+		node->registers[__count].offset = reg_addr.reg;		\
+		node->registers[__count].flags = (_flags);		\
+		if (defvalue)						\
+			node->registers[__count].value = (defvalue);	\
+		node->number_of_registers++;				\
+	} while (0)
+
 static int guc_ads_create(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -1030,6 +1048,7 @@ static int guc_ads_create(struct intel_guc *guc)
 		u8 reg_state_buffer[GUC_S3_SAVE_SPACE_PAGES * PAGE_SIZE];
 	} __packed *blob;
 	struct intel_engine_cs *engine;
+	struct i915_workarounds *workarounds = &dev_priv->workarounds;
 	enum intel_engine_id id;
 	u32 base;
 
@@ -1049,6 +1068,49 @@ static int guc_ads_create(struct intel_guc *guc)
 
 	/* MMIO reg state */
 	for_each_engine(engine, dev_priv, id) {
+		u32 i;
+		struct guc_mmio_regset *eng_reg =
+			&blob->reg_state.engine_reg[engine->guc_id];
+
+		/*
+		 * Provide a list of registers to be saved/restored during gpu
+		 * reset. This is mainly required for Media reset (aka watchdog
+		 * timeout) which is completely under the control of GuC
+		 * (resubmission of hung workload is handled inside GuC).
+		 */
+		GUC_ADD_MMIO_REG_ADS(eng_reg, RING_HWS_PGA(engine->mmio_base),
+				     GUC_REGSET_ENGINERESET |
+				     GUC_REGSET_SAVE_CURRENT_VALUE, 0);
+
+		/*
+		 * Workaround the guc issue with masked registers, note that
+		 * at this point guc submission is still disabled and the mode
+		 * register doesnt have the irq_steering bit set, which we
+		 * need to fwd irqs to GuC.
+		 */
+		GUC_ADD_MMIO_REG_ADS(eng_reg, RING_MODE_GEN7(engine),
+				     GUC_REGSET_ENGINERESET |
+				     GUC_REGSET_SAVE_DEFAULT_VALUE,
+				     I915_READ(RING_MODE_GEN7(engine)) |
+				     GFX_INTERRUPT_STEERING | (0xFFFF<<16));
+
+		GUC_ADD_MMIO_REG_ADS(eng_reg, RING_IMR(engine->mmio_base),
+				     GUC_REGSET_ENGINERESET |
+				     GUC_REGSET_SAVE_CURRENT_VALUE, 0);
+
+		/* ask guc to re-apply workarounds set in *_init_workarounds */
+		if (engine->id == RCS) {
+			for (i = 0; i < workarounds->guc_count; i++)
+				GUC_ADD_MMIO_REG_ADS(eng_reg,
+						     workarounds->guc_reg[i].addr,
+						     GUC_REGSET_ENGINERESET |
+						     GUC_REGSET_SAVE_DEFAULT_VALUE,
+						     workarounds->guc_reg[i].value);
+		}
+
+		DRM_DEBUG_DRIVER("%s register save/restore count: %u\n",
+				 engine->name, eng_reg->number_of_registers);
+
 		blob->reg_state.white_list[engine->guc_id].mmio_start =
 			i915_mmio_reg_offset(RING_FORCE_TO_NONPRIV(engine->mmio_base, 0));
 
@@ -1058,9 +1120,13 @@ static int guc_ads_create(struct intel_guc *guc)
 		 * inconsistencies with the handling of FORCE_TO_NONPRIV
 		 * registers.
 		 */
-		blob->reg_state.white_list[engine->guc_id].count = 0;
+		blob->reg_state.white_list[engine->guc_id].count =
+					workarounds->hw_whitelist_count[id];
 
-		/* Nothing to be saved or restored for now. */
+		for (i = 0; i < workarounds->hw_whitelist_count[id]; i++) {
+			blob->reg_state.white_list[engine->guc_id].offsets[i] =
+				I915_READ(RING_FORCE_TO_NONPRIV(engine->mmio_base, i));
+		}
 	}
 
 	/*
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 413bfd8d4bf4..ce005164d71c 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -708,6 +708,29 @@ static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
 	return 0;
 }
 
+static int guc_wa_add(struct drm_i915_private *dev_priv,
+		      i915_reg_t addr, const u32 val)
+{
+	const u32 idx = dev_priv->workarounds.guc_count;
+
+	I915_WRITE(addr, val);
+	if (WARN_ON(idx >= GUC_REGSET_MAX_REGISTERS))
+		return -ENOSPC;
+
+	dev_priv->workarounds.guc_reg[idx].addr = addr;
+	/* GuC can't handle masked regs, so we store the value|mask together */
+	dev_priv->workarounds.guc_reg[idx].value = val;
+	dev_priv->workarounds.guc_count++;
+
+	return 0;
+}
+
+#define WA_REG_WR_GUC_RESTORE(addr, val) do { \
+		const int r = guc_wa_add(dev_priv, (addr), (val)); \
+		if (r) \
+			return r; \
+	} while (0)
+
 static int gen8_init_workarounds(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
@@ -815,15 +838,16 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 	int ret;
 
 	/* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk */
-	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+	WA_REG_WR_GUC_RESTORE(GEN9_CSFE_CHICKEN1_RCS,
+			      _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
 
 	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk */
-	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
-		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+	WA_REG_WR_GUC_RESTORE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
+			      GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
 
 	/* WaDisableKillLogic:bxt,skl,kbl */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-		   ECOCHK_DIS_TLB);
+	WA_REG_WR_GUC_RESTORE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+			      ECOCHK_DIS_TLB);
 
 	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk */
 	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk */
@@ -894,8 +918,8 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 			  HDC_FORCE_NON_COHERENT);
 
 	/* WaDisableHDCInvalidation:skl,bxt,kbl */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-		   BDW_DISABLE_HDC_INVALIDATION);
+	WA_REG_WR_GUC_RESTORE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+			      BDW_DISABLE_HDC_INVALIDATION);
 
 	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl */
 	if (IS_SKYLAKE(dev_priv) ||
@@ -908,8 +932,8 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
 
 	/* WaOCLCoherentLineFlush:skl,bxt,kbl */
-	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
-				    GEN8_LQSC_FLUSH_COHERENT_LINES));
+	WA_REG_WR_GUC_RESTORE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
+					       GEN8_LQSC_FLUSH_COHERENT_LINES));
 
 	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk */
 	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
@@ -984,12 +1008,12 @@ static int skl_init_workarounds(struct intel_engine_cs *engine)
 	 * until D0 which is the default case so this is equivalent to
 	 * !WaDisablePerCtxtPreemptionGranularityControl:skl
 	 */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+	WA_REG_WR_GUC_RESTORE(GEN7_FF_SLICE_CS_CHICKEN1,
+			      _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
 
 	/* WaEnableGapsTsvCreditFix:skl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+	WA_REG_WR_GUC_RESTORE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+					      GEN9_GAPS_TSV_CREDIT_DISABLE));
 
 	/* WaDisableGafsUnitClkGating:skl */
 	WA_SET_BIT(GEN7_UCGCTL4, GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
@@ -1019,12 +1043,12 @@ static int bxt_init_workarounds(struct intel_engine_cs *engine)
 	/* WaStoreMultiplePTEenable:bxt */
 	/* This is a requirement according to Hardware specification */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+		WA_REG_WR_GUC_RESTORE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
 
 	/* WaSetClckGatingDisableMedia:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
-					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+		WA_REG_WR_GUC_RESTORE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
+						       ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
 	}
 
 	/* WaDisableThreadStallDopClockGating:bxt */
@@ -1060,8 +1084,8 @@ static int bxt_init_workarounds(struct intel_engine_cs *engine)
 
 	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
 	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
-		I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
-					   L3_HIGH_PRIO_CREDITS(2));
+		WA_REG_WR_GUC_RESTORE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
+						      L3_HIGH_PRIO_CREDITS(2));
 
 	/* WaToEnableHwFixForPushConstHWBug:bxt */
 	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
@@ -1086,8 +1110,8 @@ static int kbl_init_workarounds(struct intel_engine_cs *engine)
 		return ret;
 
 	/* WaEnableGapsTsvCreditFix:kbl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+	WA_REG_WR_GUC_RESTORE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+					      GEN9_GAPS_TSV_CREDIT_DISABLE));
 
 	/* WaDisableDynamicCreditSharing:kbl */
 	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
@@ -1148,6 +1172,7 @@ int init_workarounds_ring(struct intel_engine_cs *engine)
 	WARN_ON(engine->id != RCS);
 
 	dev_priv->workarounds.count = 0;
+	dev_priv->workarounds.guc_count = 0;
 	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
 
 	if (IS_BROADWELL(dev_priv))
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2017-05-22 17:46 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-22 17:46 [PATCH v8 00/20] Gen8+ engine-reset Michel Thierry
2017-05-22 17:46 ` [PATCH v8 01/20] drm/i915: Look for active requests earlier in the reset path Michel Thierry
2017-05-22 17:46 ` [PATCH v8 02/20] drm/i915: Update i915.reset to handle engine resets Michel Thierry
2017-05-22 17:46 ` [PATCH v8 03/20] drm/i915: Modify error handler for per engine hang recovery Michel Thierry
2017-06-02 20:16   ` Chris Wilson
2017-06-02 20:38     ` Michel Thierry
2017-06-02 21:29       ` Chris Wilson
2017-06-04 11:47     ` Chris Wilson
2017-06-04 12:06       ` Chris Wilson
2017-06-06  0:40         ` Michel Thierry
2017-06-06 10:16           ` Chris Wilson
2017-06-06 18:51             ` Michel Thierry
2017-06-06  0:45   ` [PATCH v9] " Michel Thierry
2017-06-06 13:03     ` kbuild test robot
2017-06-06 13:05     ` kbuild test robot
2017-05-22 17:46 ` [PATCH v8 04/20] drm/i915: Add support for per engine reset recovery Michel Thierry
2017-05-22 17:46 ` [PATCH v8 05/20] drm/i915: Add engine reset count to error state Michel Thierry
2017-05-22 17:46 ` [PATCH v8 06/20] drm/i915: Export per-engine reset count info to debugfs Michel Thierry
2017-05-22 17:46 ` [PATCH v8 07/20] drm/i915: Carry on with reset even if hw engine is not ready Michel Thierry
2017-05-22 17:46 ` [PATCH v8 08/20] drm/i915: Enable Engine reset and recovery support Michel Thierry
2017-05-22 17:46 ` [PATCH v8 09/20] drm/i915: Add engine reset count in get-reset-stats ioctl Michel Thierry
2017-05-22 17:46 ` [PATCH v8 10/20] drm/i915/selftests: reset engine self tests Michel Thierry
2017-05-22 17:46 ` [PATCH v8 11/20] drm/i915/guc: fix mmio whitelist mmio_start offset and add reminder Michel Thierry
2017-05-22 17:46 ` Michel Thierry [this message]
2017-05-22 17:46 ` [PATCH v8 13/20] drm/i915/guc: Rename the function that resets the GuC Michel Thierry
2017-05-22 17:46 ` [PATCH v8 14/20] drm/i915/guc: Add support for reset engine using GuC commands Michel Thierry
2017-05-22 17:46 ` [PATCH v8 15/20] drm/i915: Watchdog timeout: Pass GuC shared data structure during param load Michel Thierry
2017-05-22 17:46 ` [PATCH v8 16/20] drm/i915: Watchdog timeout: IRQ handler for gen8+ Michel Thierry
2017-05-22 17:46 ` [PATCH v8 17/20] drm/i915: Watchdog timeout: Ringbuffer command emission " Michel Thierry
2017-05-22 17:46 ` [PATCH v8 18/20] drm/i915: Watchdog timeout: DRM kernel interface to set the timeout Michel Thierry
2017-05-22 17:46 ` [PATCH v8 19/20] drm/i915: Watchdog timeout: Include threshold value in error state Michel Thierry
2017-05-22 17:46 ` [PATCH v8 20/20] drm/i915: Watchdog timeout: Export media reset count from GuC to debugfs Michel Thierry
2017-05-22 18:04 ` ✓ Fi.CI.BAT: success for Gen8+ engine-reset (rev11) Patchwork
2017-06-06  1:02 ` ✗ Fi.CI.BAT: failure for Gen8+ engine-reset (rev12) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170522174641.25354-13-michel.thierry@intel.com \
    --to=michel.thierry@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.