dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] i915: dedicated MCR locking and hardware semaphore
@ 2022-11-28 23:30 Matt Roper
  2022-11-28 23:30 ` [PATCH v2 1/5] drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg() Matt Roper
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Matt Roper @ 2022-11-28 23:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala, Balasubramani Vivekanandan, dri-devel

We've been overloading uncore->lock to protect access to the MCR
steering register.  That's not really what uncore->lock is intended for,
and it would be better if we didn't need to hold such a high-traffic
spinlock for the whole sequence of (apply steering, access MCR register,
restore steering).  Switch to a dedicated MCR lock to protect the
steering control register over this critical section and stop relying on
the high-traffic uncore->lock.  On pre-MTL platforms the dedicated MCR
lock is just another software lock, but on MTL and beyond we also
utilize the hardware-provided STEER_SEMAPHORE that allows us to
synchronize with external hardware and firmware agents.

v2:
 - Use irqsave/irqrestore locking; on platforms that use execlist
   submission instead of GuC, MCR accesses can happen in interrupt
   context (tasklet) during reset -> error dump.
 - Extend timeout for hardware semaphore and CI taint if we ever
   encounter it (this implies a hardware/firmware problem).  (Mika)
 - Add an extra patch optimizing xehp_setup_private_ppat by holding
   forcewake & mcr lock over the sequence of register writes.  (Bala)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Matt Roper (5):
  drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg()
  drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes
  drm/i915/gt: Add dedicated MCR lock
  drm/i915/mtl: Add hardware-level lock for steering
  drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup

 drivers/gpu/drm/i915/gt/intel_gt.c          |   7 +-
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 129 ++++++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h      |   2 +
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |   1 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h    |   8 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c         |  27 ++--
 drivers/gpu/drm/i915/gt/intel_mocs.c        |   3 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c |  12 +-
 8 files changed, 162 insertions(+), 27 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/5] drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg()
  2022-11-28 23:30 [PATCH v2 0/5] i915: dedicated MCR locking and hardware semaphore Matt Roper
@ 2022-11-28 23:30 ` Matt Roper
  2022-11-30 15:42   ` Balasubramani Vivekanandan
  2022-11-28 23:30 ` [PATCH v2 2/5] drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes Matt Roper
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Matt Roper @ 2022-11-28 23:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: kernel test robot, dri-devel

The kerneldoc function name was not updated when this function was
converted to a non-fw form.

Fixes: 192bb40f030a ("drm/i915/gt: Manage uncore->lock while waiting on MCR register")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index d9a8ff9e5e57..ea86c1ab5dc5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -702,7 +702,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
 }
 
 /**
- * intel_gt_mcr_wait_for_reg_fw - wait until MCR register matches expected state
+ * intel_gt_mcr_wait_for_reg - wait until MCR register matches expected state
  * @gt: GT structure
  * @reg: the register to read
  * @mask: mask to apply to register value
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/5] drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes
  2022-11-28 23:30 [PATCH v2 0/5] i915: dedicated MCR locking and hardware semaphore Matt Roper
  2022-11-28 23:30 ` [PATCH v2 1/5] drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg() Matt Roper
@ 2022-11-28 23:30 ` Matt Roper
  2022-11-30 15:43   ` Balasubramani Vivekanandan
  2022-11-28 23:30 ` [PATCH v2 3/5] drm/i915/gt: Add dedicated MCR lock Matt Roper
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Matt Roper @ 2022-11-28 23:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Passing the GT rather than uncore to the lowest level MCR read and write
functions will make it easier to introduce dedicated MCR locking in a
following patch.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index ea86c1ab5dc5..f4484bb18ec9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -221,7 +221,7 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
 
 /*
  * rw_with_mcr_steering_fw - Access a register with specific MCR steering
- * @uncore: pointer to struct intel_uncore
+ * @gt: GT to read register from
  * @reg: register being accessed
  * @rw_flag: FW_REG_READ for read access or FW_REG_WRITE for write access
  * @group: group number (documented as "sliceid" on older platforms)
@@ -232,10 +232,11 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
  *
  * Caller needs to make sure the relevant forcewake wells are up.
  */
-static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
+static u32 rw_with_mcr_steering_fw(struct intel_gt *gt,
 				   i915_mcr_reg_t reg, u8 rw_flag,
 				   int group, int instance, u32 value)
 {
+	struct intel_uncore *uncore = gt->uncore;
 	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
 
 	lockdep_assert_held(&uncore->lock);
@@ -308,11 +309,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 	return val;
 }
 
-static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
+static u32 rw_with_mcr_steering(struct intel_gt *gt,
 				i915_mcr_reg_t reg, u8 rw_flag,
 				int group, int instance,
 				u32 value)
 {
+	struct intel_uncore *uncore = gt->uncore;
 	enum forcewake_domains fw_domains;
 	u32 val;
 
@@ -325,7 +327,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
 	spin_lock_irq(&uncore->lock);
 	intel_uncore_forcewake_get__locked(uncore, fw_domains);
 
-	val = rw_with_mcr_steering_fw(uncore, reg, rw_flag, group, instance, value);
+	val = rw_with_mcr_steering_fw(gt, reg, rw_flag, group, instance, value);
 
 	intel_uncore_forcewake_put__locked(uncore, fw_domains);
 	spin_unlock_irq(&uncore->lock);
@@ -347,7 +349,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
 		      i915_mcr_reg_t reg,
 		      int group, int instance)
 {
-	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0);
+	return rw_with_mcr_steering(gt, reg, FW_REG_READ, group, instance, 0);
 }
 
 /**
@@ -364,7 +366,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
 void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value,
 				int group, int instance)
 {
-	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value);
+	rw_with_mcr_steering(gt, reg, FW_REG_WRITE, group, instance, value);
 }
 
 /**
@@ -588,7 +590,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
 	for (type = 0; type < NUM_STEERING_TYPES; type++) {
 		if (reg_needs_read_steering(gt, reg, type)) {
 			get_nonterminated_steering(gt, type, &group, &instance);
-			return rw_with_mcr_steering_fw(gt->uncore, reg,
+			return rw_with_mcr_steering_fw(gt, reg,
 						       FW_REG_READ,
 						       group, instance, 0);
 		}
@@ -615,7 +617,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
 	for (type = 0; type < NUM_STEERING_TYPES; type++) {
 		if (reg_needs_read_steering(gt, reg, type)) {
 			get_nonterminated_steering(gt, type, &group, &instance);
-			return rw_with_mcr_steering(gt->uncore, reg,
+			return rw_with_mcr_steering(gt, reg,
 						    FW_REG_READ,
 						    group, instance, 0);
 		}
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/5] drm/i915/gt: Add dedicated MCR lock
  2022-11-28 23:30 [PATCH v2 0/5] i915: dedicated MCR locking and hardware semaphore Matt Roper
  2022-11-28 23:30 ` [PATCH v2 1/5] drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg() Matt Roper
  2022-11-28 23:30 ` [PATCH v2 2/5] drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes Matt Roper
@ 2022-11-28 23:30 ` Matt Roper
  2022-11-30 15:45   ` Balasubramani Vivekanandan
  2022-11-28 23:30 ` [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering Matt Roper
  2022-11-28 23:30 ` [PATCH v2 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup Matt Roper
  4 siblings, 1 reply; 18+ messages in thread
From: Matt Roper @ 2022-11-28 23:30 UTC (permalink / raw)
  To: intel-gfx
  Cc: Chris Wilson, Balasubramani Vivekanandan, dri-devel, Mika Kuoppala

We've been overloading uncore->lock to protect access to the MCR
steering register.  That's not really what uncore->lock is intended for,
and it would be better if we didn't need to hold such a high-traffic
spinlock for the whole sequence of (apply steering, access MCR register,
restore steering).  Let's create a dedicated MCR lock to protect the
steering control register over this critical section and stop relying on
the high-traffic uncore->lock.

For now the new lock is a software lock.  However some platforms (MTL
and beyond) have a hardware-provided locking mechanism that can be used
to serialize not only software accesses, but also hardware/firmware
accesses as well; support for that hardware level lock will be added in
a future patch.

v2:
 - Use irqsave/irqrestore spinlock calls; platforms using execlist
   submission rather than GuC submission can perform MCR accesses in
   interrupt context because reset -> errordump happens in a tasklet.

Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c          |  7 +-
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 79 +++++++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h      |  2 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h    |  8 +++
 drivers/gpu/drm/i915/gt/intel_mocs.c        |  3 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 12 ++--
 6 files changed, 101 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 7ef0edb2e37c..6847f3bd2b03 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -1079,6 +1079,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	enum intel_engine_id id;
 	const i915_reg_t *regs;
 	unsigned int num = 0;
+	unsigned long flags;
 
 	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
 		regs = NULL;
@@ -1099,7 +1100,8 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 
 	intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
 
-	spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
+	intel_gt_mcr_lock(gt, &flags);
+	spin_lock(&uncore->lock); /* serialise invalidate with GT reset */
 
 	awake = 0;
 	for_each_engine(engine, gt, id) {
@@ -1133,7 +1135,8 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	     IS_ALDERLAKE_P(i915)))
 		intel_uncore_write_fw(uncore, GEN12_OA_TLB_INV_CR, 1);
 
-	spin_unlock_irq(&uncore->lock);
+	spin_unlock(&uncore->lock);
+	intel_gt_mcr_unlock(gt, flags);
 
 	for_each_engine_masked(engine, gt, awake, tmp) {
 		struct reg_and_bit rb;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index f4484bb18ec9..aa070ae57f11 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -143,6 +143,8 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 	unsigned long fuse;
 	int i;
 
+	spin_lock_init(&gt->mcr_lock);
+
 	/*
 	 * An mslice is unavailable only if both the meml3 for the slice is
 	 * disabled *and* all of the DSS in the slice (quadrant) are disabled.
@@ -228,6 +230,7 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
  * @instance: instance number (documented as "subsliceid" on older platforms)
  * @value: register value to be written (ignored for read)
  *
+ * Context: The caller must hold the MCR lock
  * Return: 0 for write access. register value for read access.
  *
  * Caller needs to make sure the relevant forcewake wells are up.
@@ -239,7 +242,7 @@ static u32 rw_with_mcr_steering_fw(struct intel_gt *gt,
 	struct intel_uncore *uncore = gt->uncore;
 	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
 
-	lockdep_assert_held(&uncore->lock);
+	lockdep_assert_held(&gt->mcr_lock);
 
 	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
 		/*
@@ -316,6 +319,7 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
 {
 	struct intel_uncore *uncore = gt->uncore;
 	enum forcewake_domains fw_domains;
+	unsigned long flags;
 	u32 val;
 
 	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
@@ -324,17 +328,59 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
 						     GEN8_MCR_SELECTOR,
 						     FW_REG_READ | FW_REG_WRITE);
 
-	spin_lock_irq(&uncore->lock);
+	intel_gt_mcr_lock(gt, &flags);
+	spin_lock(&uncore->lock);
 	intel_uncore_forcewake_get__locked(uncore, fw_domains);
 
 	val = rw_with_mcr_steering_fw(gt, reg, rw_flag, group, instance, value);
 
 	intel_uncore_forcewake_put__locked(uncore, fw_domains);
-	spin_unlock_irq(&uncore->lock);
+	spin_unlock(&uncore->lock);
+	intel_gt_mcr_unlock(gt, flags);
 
 	return val;
 }
 
+/**
+ * intel_gt_mcr_lock - Acquire MCR steering lock
+ * @gt: GT structure
+ * @flags: storage to save IRQ flags to
+ *
+ * Performs locking to protect the steering for the duration of an MCR
+ * operation.  Depending on the platform, this may be a software lock
+ * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
+ * access not only for the driver, but also for external hardware and
+ * firmware agents).
+ *
+ * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
+ *          function is called, although it may be acquired after this
+ *          function call.
+ */
+void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
+{
+	unsigned long __flags;
+
+	lockdep_assert_not_held(&gt->uncore->lock);
+
+	spin_lock_irqsave(&gt->mcr_lock, __flags);
+
+	*flags = __flags;
+}
+
+/**
+ * intel_gt_mcr_unlock - Release MCR steering lock
+ * @gt: GT structure
+ * @flags: IRQ flags to restore
+ *
+ * Releases the lock acquired by intel_gt_mcr_lock().
+ *
+ * Context: Releases gt->mcr_lock
+ */
+void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags)
+{
+	spin_unlock_irqrestore(&gt->mcr_lock, flags);
+}
+
 /**
  * intel_gt_mcr_read - read a specific instance of an MCR register
  * @gt: GT structure
@@ -342,6 +388,8 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
  * @group: the MCR group
  * @instance: the MCR instance
  *
+ * Context: Takes and releases gt->mcr_lock
+ *
  * Returns the value read from an MCR register after steering toward a specific
  * group/instance.
  */
@@ -362,6 +410,8 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
  *
  * Write an MCR register in unicast mode after steering toward a specific
  * group/instance.
+ *
+ * Context: Calls a function that takes and releases gt->mcr_lock
  */
 void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value,
 				int group, int instance)
@@ -376,10 +426,16 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 val
  * @value: value to write
  *
  * Write an MCR register in multicast mode to update all instances.
+ *
+ * Context: Takes and releases gt->mcr_lock
  */
 void intel_gt_mcr_multicast_write(struct intel_gt *gt,
 				  i915_mcr_reg_t reg, u32 value)
 {
+	unsigned long flags;
+
+	intel_gt_mcr_lock(gt, &flags);
+
 	/*
 	 * Ensure we have multicast behavior, just in case some non-i915 agent
 	 * left the hardware in unicast mode.
@@ -388,6 +444,8 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
 		intel_uncore_write_fw(gt->uncore, MTL_MCR_SELECTOR, GEN11_MCR_MULTICAST);
 
 	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
+
+	intel_gt_mcr_unlock(gt, flags);
 }
 
 /**
@@ -400,9 +458,13 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
  * function assumes the caller is already holding any necessary forcewake
  * domains; use intel_gt_mcr_multicast_write() in cases where forcewake should
  * be obtained automatically.
+ *
+ * Context: The caller must hold gt->mcr_lock.
  */
 void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value)
 {
+	lockdep_assert_held(&gt->mcr_lock);
+
 	/*
 	 * Ensure we have multicast behavior, just in case some non-i915 agent
 	 * left the hardware in unicast mode.
@@ -429,6 +491,8 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u3
  * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should
  * be obtained automatically.
  *
+ * Context: Calls functions that take and release gt->mcr_lock
+ *
  * Returns the old (unmodified) value read.
  */
 u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
@@ -580,6 +644,8 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
  * domains; use intel_gt_mcr_read_any() in cases where forcewake should be
  * obtained automatically.
  *
+ * Context: The caller must hold gt->mcr_lock.
+ *
  * Returns the value from a non-terminated instance of @reg.
  */
 u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
@@ -587,6 +653,8 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
 	int type;
 	u8 group, instance;
 
+	lockdep_assert_held(&gt->mcr_lock);
+
 	for (type = 0; type < NUM_STEERING_TYPES; type++) {
 		if (reg_needs_read_steering(gt, reg, type)) {
 			get_nonterminated_steering(gt, type, &group, &instance);
@@ -607,6 +675,8 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
  * Reads a GT MCR register.  The read will be steered to a non-terminated
  * instance (i.e., one that isn't fused off or powered down by power gating).
  *
+ * Context: Calls a function that takes and releases gt->mcr_lock.
+ *
  * Returns the value from a non-terminated instance of @reg.
  */
 u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
@@ -730,6 +800,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
  * Note that this routine assumes the caller holds forcewake asserted, it is
  * not suitable for very long waits.
  *
+ * Context: Calls a function that takes and releases gt->mcr_lock
  * Return: 0 if the register matches the desired condition, or -ETIMEDOUT.
  */
 int intel_gt_mcr_wait_for_reg(struct intel_gt *gt,
@@ -741,7 +812,7 @@ int intel_gt_mcr_wait_for_reg(struct intel_gt *gt,
 {
 	int ret;
 
-	lockdep_assert_not_held(&gt->uncore->lock);
+	lockdep_assert_not_held(&gt->mcr_lock);
 
 #define done ((intel_gt_mcr_read_any(gt, reg) & mask) == value)
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index ae93b20e1c17..41684495b7da 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -9,6 +9,8 @@
 #include "intel_gt_types.h"
 
 void intel_gt_mcr_init(struct intel_gt *gt);
+void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags);
+void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags);
 
 u32 intel_gt_mcr_read(struct intel_gt *gt,
 		      i915_mcr_reg_t reg,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index c1d9cd255e06..76c34c4af867 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -233,6 +233,14 @@ struct intel_gt {
 		u8 instanceid;
 	} default_steering;
 
+	/**
+	 * @mcr_lock: Protects the MCR steering register
+	 *
+	 * Protects the MCR steering register (e.g., GEN8_MCR_SELECTOR).
+	 * Should be taken before uncore->lock in cases where both are desired.
+	 */
+	spinlock_t mcr_lock;
+
 	/*
 	 * Base of per-tile GTTMMADR where we can derive the MMIO and the GGTT.
 	 */
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 49fdd509527a..69b489e8dfed 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -613,14 +613,17 @@ static u32 l3cc_combine(u16 low, u16 high)
 static void init_l3cc_table(struct intel_gt *gt,
 			    const struct drm_i915_mocs_table *table)
 {
+	unsigned long flags;
 	unsigned int i;
 	u32 l3cc;
 
+	intel_gt_mcr_lock(gt, &flags);
 	for_each_l3cc(l3cc, table, i)
 		if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
 			intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc);
 		else
 			intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc);
+	intel_gt_mcr_unlock(gt, flags);
 }
 
 void intel_mocs_init_engine(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 1b0e40e68a9d..3e35facac2b4 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1760,7 +1760,8 @@ static void wa_list_apply(const struct i915_wa_list *wal)
 
 	fw = wal_get_fw_for_rmw(uncore, wal);
 
-	spin_lock_irqsave(&uncore->lock, flags);
+	intel_gt_mcr_lock(gt, &flags);
+	spin_lock(&uncore->lock);
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
@@ -1789,7 +1790,8 @@ static void wa_list_apply(const struct i915_wa_list *wal)
 	}
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
-	spin_unlock_irqrestore(&uncore->lock, flags);
+	spin_unlock(&uncore->lock);
+	intel_gt_mcr_unlock(gt, flags);
 }
 
 void intel_gt_apply_workarounds(struct intel_gt *gt)
@@ -1810,7 +1812,8 @@ static bool wa_list_verify(struct intel_gt *gt,
 
 	fw = wal_get_fw_for_rmw(uncore, wal);
 
-	spin_lock_irqsave(&uncore->lock, flags);
+	intel_gt_mcr_lock(gt, &flags);
+	spin_lock(&uncore->lock);
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
@@ -1820,7 +1823,8 @@ static bool wa_list_verify(struct intel_gt *gt,
 				wal->name, from);
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
-	spin_unlock_irqrestore(&uncore->lock, flags);
+	spin_unlock(&uncore->lock);
+	intel_gt_mcr_unlock(gt, flags);
 
 	return ok;
 }
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering
  2022-11-28 23:30 [PATCH v2 0/5] i915: dedicated MCR locking and hardware semaphore Matt Roper
                   ` (2 preceding siblings ...)
  2022-11-28 23:30 ` [PATCH v2 3/5] drm/i915/gt: Add dedicated MCR lock Matt Roper
@ 2022-11-28 23:30 ` Matt Roper
  2022-12-02 14:46   ` [Intel-gfx] " Balasubramani Vivekanandan
  2022-12-05  8:58   ` Tvrtko Ursulin
  2022-11-28 23:30 ` [PATCH v2 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup Matt Roper
  4 siblings, 2 replies; 18+ messages in thread
From: Matt Roper @ 2022-11-28 23:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala, dri-devel

Starting with MTL, the driver needs to not only protect the steering
control register from simultaneous software accesses, but also protect
against races with hardware/firmware agents.  The hardware provides a
dedicated locking mechanism to support this via the MTL_STEER_SEMAPHORE
register.  Reading the register acts as a 'trylock' operation; the read
will return 0x1 if the lock is acquired or 0x0 if something else is
already holding the lock; once acquired, writing 0x1 to the register
will release the lock.

We'll continue to grab the software lock as well, just so lockdep can
track our locking; assuming the hardware lock is behaving properly,
there should never be any contention on the software lock in this case.

v2:
 - Extend hardware semaphore timeout and add a taint for CI if it ever
   happens (this would imply misbehaving hardware/firmware).  (Mika)
 - Add "MTL_" prefix to new steering semaphore register.  (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c  | 38 ++++++++++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |  1 +
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index aa070ae57f11..087e4ac5b68d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -347,10 +347,9 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
  * @flags: storage to save IRQ flags to
  *
  * Performs locking to protect the steering for the duration of an MCR
- * operation.  Depending on the platform, this may be a software lock
- * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
- * access not only for the driver, but also for external hardware and
- * firmware agents).
+ * operation.  On MTL and beyond, a hardware lock will also be taken to
+ * serialize access not only for the driver, but also for external hardware and
+ * firmware agents.
  *
  * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
  *          function is called, although it may be acquired after this
@@ -359,12 +358,40 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
 void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
 {
 	unsigned long __flags;
+	int err = 0;
 
 	lockdep_assert_not_held(&gt->uncore->lock);
 
+	/*
+	 * Starting with MTL, we need to coordinate not only with other
+	 * driver threads, but also with hardware/firmware agents.  A dedicated
+	 * locking register is used.
+	 */
+	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
+		err = wait_for(intel_uncore_read_fw(gt->uncore,
+						    MTL_STEER_SEMAPHORE) == 0x1, 100);
+
+	/*
+	 * Even on platforms with a hardware lock, we'll continue to grab
+	 * a software spinlock too for lockdep purposes.  If the hardware lock
+	 * was already acquired, there should never be contention on the
+	 * software lock.
+	 */
 	spin_lock_irqsave(&gt->mcr_lock, __flags);
 
 	*flags = __flags;
+
+	/*
+	 * In theory we should never fail to acquire the HW semaphore; this
+	 * would indicate some hardware/firmware is misbehaving and not
+	 * releasing it properly.
+	 */
+	if (err == -ETIMEDOUT) {
+		drm_err_ratelimited(&gt->i915->drm,
+				    "GT%u hardware MCR steering semaphore timed out",
+				    gt->info.id);
+		add_taint_for_CI(gt->i915, TAINT_WARN);  /* CI is now unreliable */
+	}
 }
 
 /**
@@ -379,6 +406,9 @@ void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
 void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags)
 {
 	spin_unlock_irqrestore(&gt->mcr_lock, flags);
+
+	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
+		intel_uncore_write_fw(gt->uncore, MTL_STEER_SEMAPHORE, 0x1);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 784152548472..1618d46cb8c7 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -67,6 +67,7 @@
 #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
 
 #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
+#define MTL_STEER_SEMAPHORE			_MMIO(0xfd0)
 #define MTL_MCR_SELECTOR			_MMIO(0xfd4)
 #define SF_MCR_SELECTOR				_MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup
  2022-11-28 23:30 [PATCH v2 0/5] i915: dedicated MCR locking and hardware semaphore Matt Roper
                   ` (3 preceding siblings ...)
  2022-11-28 23:30 ` [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering Matt Roper
@ 2022-11-28 23:30 ` Matt Roper
  2022-11-30 15:51   ` Balasubramani Vivekanandan
  4 siblings, 1 reply; 18+ messages in thread
From: Matt Roper @ 2022-11-28 23:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Balasubramani Vivekanandan, dri-devel

PPAT setup involves a series of multicast writes.  This can be optimized
slightly be acquiring forcewake and the steering lock just once for the
entire sequence.

Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2ba3983984b9..288d9f118ee9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 
 static void xehp_setup_private_ppat(struct intel_gt *gt)
 {
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
+	enum forcewake_domains fw;
+	unsigned long flags;
+
+	fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg),
+					    FW_REG_READ);
+	intel_uncore_forcewake_get(gt->uncore, fw);
+
+	intel_gt_mcr_lock(gt, &flags);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
+	intel_gt_mcr_unlock(gt, flags);
+
+	intel_uncore_forcewake_put(gt->uncore, fw);
 }
 
 static void icl_setup_private_ppat(struct intel_uncore *uncore)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/5] drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg()
  2022-11-28 23:30 ` [PATCH v2 1/5] drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg() Matt Roper
@ 2022-11-30 15:42   ` Balasubramani Vivekanandan
  0 siblings, 0 replies; 18+ messages in thread
From: Balasubramani Vivekanandan @ 2022-11-30 15:42 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: kernel test robot, dri-devel

On 28.11.2022 15:30, Matt Roper wrote:
> The kerneldoc function name was not updated when this function was
> converted to a non-fw form.
> 
> Fixes: 192bb40f030a ("drm/i915/gt: Manage uncore->lock while waiting on MCR register")
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Regards,
Bala

> ---
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index d9a8ff9e5e57..ea86c1ab5dc5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -702,7 +702,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
>  }
>  
>  /**
> - * intel_gt_mcr_wait_for_reg_fw - wait until MCR register matches expected state
> + * intel_gt_mcr_wait_for_reg - wait until MCR register matches expected state
>   * @gt: GT structure
>   * @reg: the register to read
>   * @mask: mask to apply to register value
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/5] drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes
  2022-11-28 23:30 ` [PATCH v2 2/5] drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes Matt Roper
@ 2022-11-30 15:43   ` Balasubramani Vivekanandan
  0 siblings, 0 replies; 18+ messages in thread
From: Balasubramani Vivekanandan @ 2022-11-30 15:43 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: dri-devel

On 28.11.2022 15:30, Matt Roper wrote:
> Passing the GT rather than uncore to the lowest level MCR read and write
> functions will make it easier to introduce dedicated MCR locking in a
> following patch.
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Regards,
Bala

> ---
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index ea86c1ab5dc5..f4484bb18ec9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -221,7 +221,7 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
>  
>  /*
>   * rw_with_mcr_steering_fw - Access a register with specific MCR steering
> - * @uncore: pointer to struct intel_uncore
> + * @gt: GT to read register from
>   * @reg: register being accessed
>   * @rw_flag: FW_REG_READ for read access or FW_REG_WRITE for write access
>   * @group: group number (documented as "sliceid" on older platforms)
> @@ -232,10 +232,11 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
>   *
>   * Caller needs to make sure the relevant forcewake wells are up.
>   */
> -static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> +static u32 rw_with_mcr_steering_fw(struct intel_gt *gt,
>  				   i915_mcr_reg_t reg, u8 rw_flag,
>  				   int group, int instance, u32 value)
>  {
> +	struct intel_uncore *uncore = gt->uncore;
>  	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
>  
>  	lockdep_assert_held(&uncore->lock);
> @@ -308,11 +309,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  	return val;
>  }
>  
> -static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
> +static u32 rw_with_mcr_steering(struct intel_gt *gt,
>  				i915_mcr_reg_t reg, u8 rw_flag,
>  				int group, int instance,
>  				u32 value)
>  {
> +	struct intel_uncore *uncore = gt->uncore;
>  	enum forcewake_domains fw_domains;
>  	u32 val;
>  
> @@ -325,7 +327,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
>  	spin_lock_irq(&uncore->lock);
>  	intel_uncore_forcewake_get__locked(uncore, fw_domains);
>  
> -	val = rw_with_mcr_steering_fw(uncore, reg, rw_flag, group, instance, value);
> +	val = rw_with_mcr_steering_fw(gt, reg, rw_flag, group, instance, value);
>  
>  	intel_uncore_forcewake_put__locked(uncore, fw_domains);
>  	spin_unlock_irq(&uncore->lock);
> @@ -347,7 +349,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
>  		      i915_mcr_reg_t reg,
>  		      int group, int instance)
>  {
> -	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0);
> +	return rw_with_mcr_steering(gt, reg, FW_REG_READ, group, instance, 0);
>  }
>  
>  /**
> @@ -364,7 +366,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
>  void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value,
>  				int group, int instance)
>  {
> -	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value);
> +	rw_with_mcr_steering(gt, reg, FW_REG_WRITE, group, instance, value);
>  }
>  
>  /**
> @@ -588,7 +590,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
>  	for (type = 0; type < NUM_STEERING_TYPES; type++) {
>  		if (reg_needs_read_steering(gt, reg, type)) {
>  			get_nonterminated_steering(gt, type, &group, &instance);
> -			return rw_with_mcr_steering_fw(gt->uncore, reg,
> +			return rw_with_mcr_steering_fw(gt, reg,
>  						       FW_REG_READ,
>  						       group, instance, 0);
>  		}
> @@ -615,7 +617,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
>  	for (type = 0; type < NUM_STEERING_TYPES; type++) {
>  		if (reg_needs_read_steering(gt, reg, type)) {
>  			get_nonterminated_steering(gt, type, &group, &instance);
> -			return rw_with_mcr_steering(gt->uncore, reg,
> +			return rw_with_mcr_steering(gt, reg,
>  						    FW_REG_READ,
>  						    group, instance, 0);
>  		}
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/5] drm/i915/gt: Add dedicated MCR lock
  2022-11-28 23:30 ` [PATCH v2 3/5] drm/i915/gt: Add dedicated MCR lock Matt Roper
@ 2022-11-30 15:45   ` Balasubramani Vivekanandan
  0 siblings, 0 replies; 18+ messages in thread
From: Balasubramani Vivekanandan @ 2022-11-30 15:45 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: Mika Kuoppala, Chris Wilson, dri-devel

On 28.11.2022 15:30, Matt Roper wrote:
> We've been overloading uncore->lock to protect access to the MCR
> steering register.  That's not really what uncore->lock is intended for,
> and it would be better if we didn't need to hold such a high-traffic
> spinlock for the whole sequence of (apply steering, access MCR register,
> restore steering).  Let's create a dedicated MCR lock to protect the
> steering control register over this critical section and stop relying on
> the high-traffic uncore->lock.
> 
> For now the new lock is a software lock.  However some platforms (MTL
> and beyond) have a hardware-provided locking mechanism that can be used
> to serialize not only software accesses, but also hardware/firmware
> accesses as well; support for that hardware level lock will be added in
> a future patch.
> 
> v2:
>  - Use irqsave/irqrestore spinlock calls; platforms using execlist
>    submission rather than GuC submission can perform MCR accesses in
>    interrupt context because reset -> errordump happens in a tasklet.
> 
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Regards,
Bala

> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c          |  7 +-
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 79 +++++++++++++++++++--
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.h      |  2 +
>  drivers/gpu/drm/i915/gt/intel_gt_types.h    |  8 +++
>  drivers/gpu/drm/i915/gt/intel_mocs.c        |  3 +
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 12 ++--
>  6 files changed, 101 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index 7ef0edb2e37c..6847f3bd2b03 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -1079,6 +1079,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  	enum intel_engine_id id;
>  	const i915_reg_t *regs;
>  	unsigned int num = 0;
> +	unsigned long flags;
>  
>  	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
>  		regs = NULL;
> @@ -1099,7 +1100,8 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  
>  	intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
>  
> -	spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
> +	intel_gt_mcr_lock(gt, &flags);
> +	spin_lock(&uncore->lock); /* serialise invalidate with GT reset */
>  
>  	awake = 0;
>  	for_each_engine(engine, gt, id) {
> @@ -1133,7 +1135,8 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  	     IS_ALDERLAKE_P(i915)))
>  		intel_uncore_write_fw(uncore, GEN12_OA_TLB_INV_CR, 1);
>  
> -	spin_unlock_irq(&uncore->lock);
> +	spin_unlock(&uncore->lock);
> +	intel_gt_mcr_unlock(gt, flags);
>  
>  	for_each_engine_masked(engine, gt, awake, tmp) {
>  		struct reg_and_bit rb;
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index f4484bb18ec9..aa070ae57f11 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -143,6 +143,8 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>  	unsigned long fuse;
>  	int i;
>  
> +	spin_lock_init(&gt->mcr_lock);
> +
>  	/*
>  	 * An mslice is unavailable only if both the meml3 for the slice is
>  	 * disabled *and* all of the DSS in the slice (quadrant) are disabled.
> @@ -228,6 +230,7 @@ static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
>   * @instance: instance number (documented as "subsliceid" on older platforms)
>   * @value: register value to be written (ignored for read)
>   *
> + * Context: The caller must hold the MCR lock
>   * Return: 0 for write access. register value for read access.
>   *
>   * Caller needs to make sure the relevant forcewake wells are up.
> @@ -239,7 +242,7 @@ static u32 rw_with_mcr_steering_fw(struct intel_gt *gt,
>  	struct intel_uncore *uncore = gt->uncore;
>  	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
>  
> -	lockdep_assert_held(&uncore->lock);
> +	lockdep_assert_held(&gt->mcr_lock);
>  
>  	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
>  		/*
> @@ -316,6 +319,7 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>  {
>  	struct intel_uncore *uncore = gt->uncore;
>  	enum forcewake_domains fw_domains;
> +	unsigned long flags;
>  	u32 val;
>  
>  	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
> @@ -324,17 +328,59 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>  						     GEN8_MCR_SELECTOR,
>  						     FW_REG_READ | FW_REG_WRITE);
>  
> -	spin_lock_irq(&uncore->lock);
> +	intel_gt_mcr_lock(gt, &flags);
> +	spin_lock(&uncore->lock);
>  	intel_uncore_forcewake_get__locked(uncore, fw_domains);
>  
>  	val = rw_with_mcr_steering_fw(gt, reg, rw_flag, group, instance, value);
>  
>  	intel_uncore_forcewake_put__locked(uncore, fw_domains);
> -	spin_unlock_irq(&uncore->lock);
> +	spin_unlock(&uncore->lock);
> +	intel_gt_mcr_unlock(gt, flags);
>  
>  	return val;
>  }
>  
> +/**
> + * intel_gt_mcr_lock - Acquire MCR steering lock
> + * @gt: GT structure
> + * @flags: storage to save IRQ flags to
> + *
> + * Performs locking to protect the steering for the duration of an MCR
> + * operation.  Depending on the platform, this may be a software lock
> + * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
> + * access not only for the driver, but also for external hardware and
> + * firmware agents).
> + *
> + * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
> + *          function is called, although it may be acquired after this
> + *          function call.
> + */
> +void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
> +{
> +	unsigned long __flags;
> +
> +	lockdep_assert_not_held(&gt->uncore->lock);
> +
> +	spin_lock_irqsave(&gt->mcr_lock, __flags);
> +
> +	*flags = __flags;
> +}
> +
> +/**
> + * intel_gt_mcr_unlock - Release MCR steering lock
> + * @gt: GT structure
> + * @flags: IRQ flags to restore
> + *
> + * Releases the lock acquired by intel_gt_mcr_lock().
> + *
> + * Context: Releases gt->mcr_lock
> + */
> +void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags)
> +{
> +	spin_unlock_irqrestore(&gt->mcr_lock, flags);
> +}
> +
>  /**
>   * intel_gt_mcr_read - read a specific instance of an MCR register
>   * @gt: GT structure
> @@ -342,6 +388,8 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>   * @group: the MCR group
>   * @instance: the MCR instance
>   *
> + * Context: Takes and releases gt->mcr_lock
> + *
>   * Returns the value read from an MCR register after steering toward a specific
>   * group/instance.
>   */
> @@ -362,6 +410,8 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
>   *
>   * Write an MCR register in unicast mode after steering toward a specific
>   * group/instance.
> + *
> + * Context: Calls a function that takes and releases gt->mcr_lock
>   */
>  void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value,
>  				int group, int instance)
> @@ -376,10 +426,16 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 val
>   * @value: value to write
>   *
>   * Write an MCR register in multicast mode to update all instances.
> + *
> + * Context: Takes and releases gt->mcr_lock
>   */
>  void intel_gt_mcr_multicast_write(struct intel_gt *gt,
>  				  i915_mcr_reg_t reg, u32 value)
>  {
> +	unsigned long flags;
> +
> +	intel_gt_mcr_lock(gt, &flags);
> +
>  	/*
>  	 * Ensure we have multicast behavior, just in case some non-i915 agent
>  	 * left the hardware in unicast mode.
> @@ -388,6 +444,8 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
>  		intel_uncore_write_fw(gt->uncore, MTL_MCR_SELECTOR, GEN11_MCR_MULTICAST);
>  
>  	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
> +
> +	intel_gt_mcr_unlock(gt, flags);
>  }
>  
>  /**
> @@ -400,9 +458,13 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
>   * function assumes the caller is already holding any necessary forcewake
>   * domains; use intel_gt_mcr_multicast_write() in cases where forcewake should
>   * be obtained automatically.
> + *
> + * Context: The caller must hold gt->mcr_lock.
>   */
>  void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value)
>  {
> +	lockdep_assert_held(&gt->mcr_lock);
> +
>  	/*
>  	 * Ensure we have multicast behavior, just in case some non-i915 agent
>  	 * left the hardware in unicast mode.
> @@ -429,6 +491,8 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u3
>   * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should
>   * be obtained automatically.
>   *
> + * Context: Calls functions that take and release gt->mcr_lock
> + *
>   * Returns the old (unmodified) value read.
>   */
>  u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
> @@ -580,6 +644,8 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
>   * domains; use intel_gt_mcr_read_any() in cases where forcewake should be
>   * obtained automatically.
>   *
> + * Context: The caller must hold gt->mcr_lock.
> + *
>   * Returns the value from a non-terminated instance of @reg.
>   */
>  u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
> @@ -587,6 +653,8 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
>  	int type;
>  	u8 group, instance;
>  
> +	lockdep_assert_held(&gt->mcr_lock);
> +
>  	for (type = 0; type < NUM_STEERING_TYPES; type++) {
>  		if (reg_needs_read_steering(gt, reg, type)) {
>  			get_nonterminated_steering(gt, type, &group, &instance);
> @@ -607,6 +675,8 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
>   * Reads a GT MCR register.  The read will be steered to a non-terminated
>   * instance (i.e., one that isn't fused off or powered down by power gating).
>   *
> + * Context: Calls a function that takes and releases gt->mcr_lock.
> + *
>   * Returns the value from a non-terminated instance of @reg.
>   */
>  u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
> @@ -730,6 +800,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
>   * Note that this routine assumes the caller holds forcewake asserted, it is
>   * not suitable for very long waits.
>   *
> + * Context: Calls a function that takes and releases gt->mcr_lock
>   * Return: 0 if the register matches the desired condition, or -ETIMEDOUT.
>   */
>  int intel_gt_mcr_wait_for_reg(struct intel_gt *gt,
> @@ -741,7 +812,7 @@ int intel_gt_mcr_wait_for_reg(struct intel_gt *gt,
>  {
>  	int ret;
>  
> -	lockdep_assert_not_held(&gt->uncore->lock);
> +	lockdep_assert_not_held(&gt->mcr_lock);
>  
>  #define done ((intel_gt_mcr_read_any(gt, reg) & mask) == value)
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> index ae93b20e1c17..41684495b7da 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> @@ -9,6 +9,8 @@
>  #include "intel_gt_types.h"
>  
>  void intel_gt_mcr_init(struct intel_gt *gt);
> +void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags);
> +void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags);
>  
>  u32 intel_gt_mcr_read(struct intel_gt *gt,
>  		      i915_mcr_reg_t reg,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index c1d9cd255e06..76c34c4af867 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -233,6 +233,14 @@ struct intel_gt {
>  		u8 instanceid;
>  	} default_steering;
>  
> +	/**
> +	 * @mcr_lock: Protects the MCR steering register
> +	 *
> +	 * Protects the MCR steering register (e.g., GEN8_MCR_SELECTOR).
> +	 * Should be taken before uncore->lock in cases where both are desired.
> +	 */
> +	spinlock_t mcr_lock;
> +
>  	/*
>  	 * Base of per-tile GTTMMADR where we can derive the MMIO and the GGTT.
>  	 */
> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
> index 49fdd509527a..69b489e8dfed 100644
> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> @@ -613,14 +613,17 @@ static u32 l3cc_combine(u16 low, u16 high)
>  static void init_l3cc_table(struct intel_gt *gt,
>  			    const struct drm_i915_mocs_table *table)
>  {
> +	unsigned long flags;
>  	unsigned int i;
>  	u32 l3cc;
>  
> +	intel_gt_mcr_lock(gt, &flags);
>  	for_each_l3cc(l3cc, table, i)
>  		if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
>  			intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc);
>  		else
>  			intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc);
> +	intel_gt_mcr_unlock(gt, flags);
>  }
>  
>  void intel_mocs_init_engine(struct intel_engine_cs *engine)
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 1b0e40e68a9d..3e35facac2b4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1760,7 +1760,8 @@ static void wa_list_apply(const struct i915_wa_list *wal)
>  
>  	fw = wal_get_fw_for_rmw(uncore, wal);
>  
> -	spin_lock_irqsave(&uncore->lock, flags);
> +	intel_gt_mcr_lock(gt, &flags);
> +	spin_lock(&uncore->lock);
>  	intel_uncore_forcewake_get__locked(uncore, fw);
>  
>  	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
> @@ -1789,7 +1790,8 @@ static void wa_list_apply(const struct i915_wa_list *wal)
>  	}
>  
>  	intel_uncore_forcewake_put__locked(uncore, fw);
> -	spin_unlock_irqrestore(&uncore->lock, flags);
> +	spin_unlock(&uncore->lock);
> +	intel_gt_mcr_unlock(gt, flags);
>  }
>  
>  void intel_gt_apply_workarounds(struct intel_gt *gt)
> @@ -1810,7 +1812,8 @@ static bool wa_list_verify(struct intel_gt *gt,
>  
>  	fw = wal_get_fw_for_rmw(uncore, wal);
>  
> -	spin_lock_irqsave(&uncore->lock, flags);
> +	intel_gt_mcr_lock(gt, &flags);
> +	spin_lock(&uncore->lock);
>  	intel_uncore_forcewake_get__locked(uncore, fw);
>  
>  	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
> @@ -1820,7 +1823,8 @@ static bool wa_list_verify(struct intel_gt *gt,
>  				wal->name, from);
>  
>  	intel_uncore_forcewake_put__locked(uncore, fw);
> -	spin_unlock_irqrestore(&uncore->lock, flags);
> +	spin_unlock(&uncore->lock);
> +	intel_gt_mcr_unlock(gt, flags);
>  
>  	return ok;
>  }
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup
  2022-11-28 23:30 ` [PATCH v2 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup Matt Roper
@ 2022-11-30 15:51   ` Balasubramani Vivekanandan
  2022-11-30 15:58     ` [PATCH v3 " Matt Roper
  2022-11-30 16:05     ` [PATCH v2 " Matt Roper
  0 siblings, 2 replies; 18+ messages in thread
From: Balasubramani Vivekanandan @ 2022-11-30 15:51 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: dri-devel

On 28.11.2022 15:30, Matt Roper wrote:
> PPAT setup involves a series of multicast writes.  This can be optimized
> slightly be acquiring forcewake and the steering lock just once for the
> entire sequence.
> 
> Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++--------
>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 2ba3983984b9..288d9f118ee9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
>  
>  static void xehp_setup_private_ppat(struct intel_gt *gt)
>  {
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
> +	enum forcewake_domains fw;
> +	unsigned long flags;
> +
> +	fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg),
> +					    FW_REG_READ);

I am not completely aware of forcewake implementation. I am wondering if
the last parameter should be FW_REG_WRITE since it is register write
which is happening later.

Regards,
Bala

> +	intel_uncore_forcewake_get(gt->uncore, fw);
> +
> +	intel_gt_mcr_lock(gt, &flags);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
> +	intel_gt_mcr_unlock(gt, flags);
> +
> +	intel_uncore_forcewake_put(gt->uncore, fw);
>  }
>  
>  static void icl_setup_private_ppat(struct intel_uncore *uncore)
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup
  2022-11-30 15:51   ` Balasubramani Vivekanandan
@ 2022-11-30 15:58     ` Matt Roper
  2022-12-01  9:26       ` Balasubramani Vivekanandan
  2022-11-30 16:05     ` [PATCH v2 " Matt Roper
  1 sibling, 1 reply; 18+ messages in thread
From: Matt Roper @ 2022-11-30 15:58 UTC (permalink / raw)
  To: intel-gfx; +Cc: Balasubramani Vivekanandan, dri-devel

PPAT setup involves a series of multicast writes.  This can be optimized
slightly be acquiring forcewake and the steering lock just once for the
entire sequence.

v2:
 - We should use FW_REG_WRITE instead of FW_REG_READ.  (Bala)

Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2ba3983984b9..e37164a60d37 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 
 static void xehp_setup_private_ppat(struct intel_gt *gt)
 {
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
-	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
+	enum forcewake_domains fw;
+	unsigned long flags;
+
+	fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg),
+					    FW_REG_WRITE);
+	intel_uncore_forcewake_get(gt->uncore, fw);
+
+	intel_gt_mcr_lock(gt, &flags);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
+	intel_gt_mcr_unlock(gt, flags);
+
+	intel_uncore_forcewake_put(gt->uncore, fw);
 }
 
 static void icl_setup_private_ppat(struct intel_uncore *uncore)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup
  2022-11-30 15:51   ` Balasubramani Vivekanandan
  2022-11-30 15:58     ` [PATCH v3 " Matt Roper
@ 2022-11-30 16:05     ` Matt Roper
  1 sibling, 0 replies; 18+ messages in thread
From: Matt Roper @ 2022-11-30 16:05 UTC (permalink / raw)
  To: Balasubramani Vivekanandan; +Cc: intel-gfx, dri-devel

On Wed, Nov 30, 2022 at 09:21:07PM +0530, Balasubramani Vivekanandan wrote:
> On 28.11.2022 15:30, Matt Roper wrote:
> > PPAT setup involves a series of multicast writes.  This can be optimized
> > slightly be acquiring forcewake and the steering lock just once for the
> > entire sequence.
> > 
> > Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++--------
> >  1 file changed, 19 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > index 2ba3983984b9..288d9f118ee9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > @@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
> >  
> >  static void xehp_setup_private_ppat(struct intel_gt *gt)
> >  {
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
> > +	enum forcewake_domains fw;
> > +	unsigned long flags;
> > +
> > +	fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg),
> > +					    FW_REG_READ);
> 
> I am not completely aware of forcewake implementation. I am wondering if
> the last parameter should be FW_REG_WRITE since it is register write
> which is happening later.

Yep, good catch.  Using FW_REG_WRITE allows the driver to potentially
skip obtaining forcewake and waking the hardware if the registers being
written are "shadowed" so it's always good to use FW_REG_WRITE in places
where we're only writing and not reading.

In this case the specific registers in question don't appear to be part
of the shadow table for any affected platforms (e.g.,
dg2_shadowed_regs[] and such in intel_uncore.c), so FW_REG_WRITE will
still behave the same as FW_REG_READ here.  But it's always possible
that future platforms could decide to shadow these registers, so it's
good to fix anyway; I just sent an updated copy of this patch making
that change.


Matt

> 
> Regards,
> Bala
> 
> > +	intel_uncore_forcewake_get(gt->uncore, fw);
> > +
> > +	intel_gt_mcr_lock(gt, &flags);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
> > +	intel_gt_mcr_unlock(gt, flags);
> > +
> > +	intel_uncore_forcewake_put(gt->uncore, fw);
> >  }
> >  
> >  static void icl_setup_private_ppat(struct intel_uncore *uncore)
> > -- 
> > 2.38.1
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup
  2022-11-30 15:58     ` [PATCH v3 " Matt Roper
@ 2022-12-01  9:26       ` Balasubramani Vivekanandan
  2022-12-01 21:01         ` Matt Roper
  0 siblings, 1 reply; 18+ messages in thread
From: Balasubramani Vivekanandan @ 2022-12-01  9:26 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: dri-devel

On 30.11.2022 07:58, Matt Roper wrote:
> PPAT setup involves a series of multicast writes.  This can be optimized
> slightly be acquiring forcewake and the steering lock just once for the
> entire sequence.
> 
> v2:
>  - We should use FW_REG_WRITE instead of FW_REG_READ.  (Bala)
> 
> Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Regards,
Bala

> ---
>  drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++--------
>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 2ba3983984b9..e37164a60d37 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
>  
>  static void xehp_setup_private_ppat(struct intel_gt *gt)
>  {
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
> -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
> +	enum forcewake_domains fw;
> +	unsigned long flags;
> +
> +	fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg),
> +					    FW_REG_WRITE);
> +	intel_uncore_forcewake_get(gt->uncore, fw);
> +
> +	intel_gt_mcr_lock(gt, &flags);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
> +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
> +	intel_gt_mcr_unlock(gt, flags);
> +
> +	intel_uncore_forcewake_put(gt->uncore, fw);
>  }
>  
>  static void icl_setup_private_ppat(struct intel_uncore *uncore)
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup
  2022-12-01  9:26       ` Balasubramani Vivekanandan
@ 2022-12-01 21:01         ` Matt Roper
  0 siblings, 0 replies; 18+ messages in thread
From: Matt Roper @ 2022-12-01 21:01 UTC (permalink / raw)
  To: Balasubramani Vivekanandan; +Cc: intel-gfx, dri-devel

On Thu, Dec 01, 2022 at 02:56:30PM +0530, Balasubramani Vivekanandan wrote:
> On 30.11.2022 07:58, Matt Roper wrote:
> > PPAT setup involves a series of multicast writes.  This can be optimized
> > slightly be acquiring forcewake and the steering lock just once for the
> > entire sequence.
> > 
> > v2:
> >  - We should use FW_REG_WRITE instead of FW_REG_READ.  (Bala)
> > 
> > Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> 
> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Thanks.  Since this patch is independent of patch #4 (the only one that
hasn't been reviewed yet), I went ahead and pushed this one to
drm-intel-gt-next.  BTW, I noticed I wrote "mtl" in the patch title
where I actually meant to have "mcr" (this isn't a MTL-specific change),
so I corrected that typo while pushing as well.


Matt

> 
> Regards,
> Bala
> 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++--------
> >  1 file changed, 19 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > index 2ba3983984b9..e37164a60d37 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > @@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
> >  
> >  static void xehp_setup_private_ppat(struct intel_gt *gt)
> >  {
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
> > -	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
> > +	enum forcewake_domains fw;
> > +	unsigned long flags;
> > +
> > +	fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg),
> > +					    FW_REG_WRITE);
> > +	intel_uncore_forcewake_get(gt->uncore, fw);
> > +
> > +	intel_gt_mcr_lock(gt, &flags);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
> > +	intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
> > +	intel_gt_mcr_unlock(gt, flags);
> > +
> > +	intel_uncore_forcewake_put(gt->uncore, fw);
> >  }
> >  
> >  static void icl_setup_private_ppat(struct intel_uncore *uncore)
> > -- 
> > 2.38.1
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Intel-gfx] [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering
  2022-11-28 23:30 ` [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering Matt Roper
@ 2022-12-02 14:46   ` Balasubramani Vivekanandan
  2022-12-05  8:58   ` Tvrtko Ursulin
  1 sibling, 0 replies; 18+ messages in thread
From: Balasubramani Vivekanandan @ 2022-12-02 14:46 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: dri-devel

On 28.11.2022 15:30, Matt Roper wrote:
> Starting with MTL, the driver needs to not only protect the steering
> control register from simultaneous software accesses, but also protect
> against races with hardware/firmware agents.  The hardware provides a
> dedicated locking mechanism to support this via the MTL_STEER_SEMAPHORE
> register.  Reading the register acts as a 'trylock' operation; the read
> will return 0x1 if the lock is acquired or 0x0 if something else is
> already holding the lock; once acquired, writing 0x1 to the register
> will release the lock.
> 
> We'll continue to grab the software lock as well, just so lockdep can
> track our locking; assuming the hardware lock is behaving properly,
> there should never be any contention on the software lock in this case.
> 
> v2:
>  - Extend hardware semaphore timeout and add a taint for CI if it ever
>    happens (this would imply misbehaving hardware/firmware).  (Mika)
>  - Add "MTL_" prefix to new steering semaphore register.  (Mika)
> 
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>

Regards,
Bala

> ---
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.c  | 38 ++++++++++++++++++++++---
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h |  1 +
>  2 files changed, 35 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index aa070ae57f11..087e4ac5b68d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -347,10 +347,9 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>   * @flags: storage to save IRQ flags to
>   *
>   * Performs locking to protect the steering for the duration of an MCR
> - * operation.  Depending on the platform, this may be a software lock
> - * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
> - * access not only for the driver, but also for external hardware and
> - * firmware agents).
> + * operation.  On MTL and beyond, a hardware lock will also be taken to
> + * serialize access not only for the driver, but also for external hardware and
> + * firmware agents.
>   *
>   * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
>   *          function is called, although it may be acquired after this
> @@ -359,12 +358,40 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>  void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
>  {
>  	unsigned long __flags;
> +	int err = 0;
>  
>  	lockdep_assert_not_held(&gt->uncore->lock);
>  
> +	/*
> +	 * Starting with MTL, we need to coordinate not only with other
> +	 * driver threads, but also with hardware/firmware agents.  A dedicated
> +	 * locking register is used.
> +	 */
> +	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
> +		err = wait_for(intel_uncore_read_fw(gt->uncore,
> +						    MTL_STEER_SEMAPHORE) == 0x1, 100);
> +
> +	/*
> +	 * Even on platforms with a hardware lock, we'll continue to grab
> +	 * a software spinlock too for lockdep purposes.  If the hardware lock
> +	 * was already acquired, there should never be contention on the
> +	 * software lock.
> +	 */
>  	spin_lock_irqsave(&gt->mcr_lock, __flags);
>  
>  	*flags = __flags;
> +
> +	/*
> +	 * In theory we should never fail to acquire the HW semaphore; this
> +	 * would indicate some hardware/firmware is misbehaving and not
> +	 * releasing it properly.
> +	 */
> +	if (err == -ETIMEDOUT) {
> +		drm_err_ratelimited(&gt->i915->drm,
> +				    "GT%u hardware MCR steering semaphore timed out",
> +				    gt->info.id);
> +		add_taint_for_CI(gt->i915, TAINT_WARN);  /* CI is now unreliable */
> +	}
>  }
>  
>  /**
> @@ -379,6 +406,9 @@ void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
>  void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags)
>  {
>  	spin_unlock_irqrestore(&gt->mcr_lock, flags);
> +
> +	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
> +		intel_uncore_write_fw(gt->uncore, MTL_STEER_SEMAPHORE, 0x1);
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index 784152548472..1618d46cb8c7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -67,6 +67,7 @@
>  #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
>  
>  #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
> +#define MTL_STEER_SEMAPHORE			_MMIO(0xfd0)
>  #define MTL_MCR_SELECTOR			_MMIO(0xfd4)
>  #define SF_MCR_SELECTOR				_MMIO(0xfd8)
>  #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Intel-gfx] [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering
  2022-11-28 23:30 ` [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering Matt Roper
  2022-12-02 14:46   ` [Intel-gfx] " Balasubramani Vivekanandan
@ 2022-12-05  8:58   ` Tvrtko Ursulin
  2022-12-05 15:52     ` Matt Roper
  1 sibling, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2022-12-05  8:58 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: dri-devel


On 28/11/2022 23:30, Matt Roper wrote:
> Starting with MTL, the driver needs to not only protect the steering
> control register from simultaneous software accesses, but also protect
> against races with hardware/firmware agents.  The hardware provides a
> dedicated locking mechanism to support this via the MTL_STEER_SEMAPHORE
> register.  Reading the register acts as a 'trylock' operation; the read
> will return 0x1 if the lock is acquired or 0x0 if something else is
> already holding the lock; once acquired, writing 0x1 to the register
> will release the lock.
> 
> We'll continue to grab the software lock as well, just so lockdep can
> track our locking; assuming the hardware lock is behaving properly,
> there should never be any contention on the software lock in this case.
> 
> v2:
>   - Extend hardware semaphore timeout and add a taint for CI if it ever
>     happens (this would imply misbehaving hardware/firmware).  (Mika)
>   - Add "MTL_" prefix to new steering semaphore register.  (Mika)
> 
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_mcr.c  | 38 ++++++++++++++++++++++---
>   drivers/gpu/drm/i915/gt/intel_gt_regs.h |  1 +
>   2 files changed, 35 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index aa070ae57f11..087e4ac5b68d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -347,10 +347,9 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>    * @flags: storage to save IRQ flags to
>    *
>    * Performs locking to protect the steering for the duration of an MCR
> - * operation.  Depending on the platform, this may be a software lock
> - * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
> - * access not only for the driver, but also for external hardware and
> - * firmware agents).
> + * operation.  On MTL and beyond, a hardware lock will also be taken to
> + * serialize access not only for the driver, but also for external hardware and
> + * firmware agents.
>    *
>    * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
>    *          function is called, although it may be acquired after this
> @@ -359,12 +358,40 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>   void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
>   {
>   	unsigned long __flags;
> +	int err = 0;
>   
>   	lockdep_assert_not_held(&gt->uncore->lock);
>   
> +	/*
> +	 * Starting with MTL, we need to coordinate not only with other
> +	 * driver threads, but also with hardware/firmware agents.  A dedicated
> +	 * locking register is used.
> +	 */
> +	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
> +		err = wait_for(intel_uncore_read_fw(gt->uncore,
> +						    MTL_STEER_SEMAPHORE) == 0x1, 100);
> +

If two i915 threads enter here what happens? (Given hw locking is done 
before the spinlock.)

Regards,

Tvrtko

> +	/*
> +	 * Even on platforms with a hardware lock, we'll continue to grab
> +	 * a software spinlock too for lockdep purposes.  If the hardware lock
> +	 * was already acquired, there should never be contention on the
> +	 * software lock.
> +	 */
>   	spin_lock_irqsave(&gt->mcr_lock, __flags);
>   
>   	*flags = __flags;
> +
> +	/*
> +	 * In theory we should never fail to acquire the HW semaphore; this
> +	 * would indicate some hardware/firmware is misbehaving and not
> +	 * releasing it properly.
> +	 */
> +	if (err == -ETIMEDOUT) {
> +		drm_err_ratelimited(&gt->i915->drm,
> +				    "GT%u hardware MCR steering semaphore timed out",
> +				    gt->info.id);
> +		add_taint_for_CI(gt->i915, TAINT_WARN);  /* CI is now unreliable */
> +	}
>   }
>   
>   /**
> @@ -379,6 +406,9 @@ void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
>   void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags)
>   {
>   	spin_unlock_irqrestore(&gt->mcr_lock, flags);
> +
> +	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
> +		intel_uncore_write_fw(gt->uncore, MTL_STEER_SEMAPHORE, 0x1);
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index 784152548472..1618d46cb8c7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -67,6 +67,7 @@
>   #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
>   
>   #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
> +#define MTL_STEER_SEMAPHORE			_MMIO(0xfd0)
>   #define MTL_MCR_SELECTOR			_MMIO(0xfd4)
>   #define SF_MCR_SELECTOR				_MMIO(0xfd8)
>   #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Intel-gfx] [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering
  2022-12-05  8:58   ` Tvrtko Ursulin
@ 2022-12-05 15:52     ` Matt Roper
  2022-12-05 18:16       ` Tvrtko Ursulin
  0 siblings, 1 reply; 18+ messages in thread
From: Matt Roper @ 2022-12-05 15:52 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel

On Mon, Dec 05, 2022 at 08:58:16AM +0000, Tvrtko Ursulin wrote:
> 
> On 28/11/2022 23:30, Matt Roper wrote:
> > Starting with MTL, the driver needs to not only protect the steering
> > control register from simultaneous software accesses, but also protect
> > against races with hardware/firmware agents.  The hardware provides a
> > dedicated locking mechanism to support this via the MTL_STEER_SEMAPHORE
> > register.  Reading the register acts as a 'trylock' operation; the read
> > will return 0x1 if the lock is acquired or 0x0 if something else is
> > already holding the lock; once acquired, writing 0x1 to the register
> > will release the lock.
> > 
> > We'll continue to grab the software lock as well, just so lockdep can
> > track our locking; assuming the hardware lock is behaving properly,
> > there should never be any contention on the software lock in this case.
> > 
> > v2:
> >   - Extend hardware semaphore timeout and add a taint for CI if it ever
> >     happens (this would imply misbehaving hardware/firmware).  (Mika)
> >   - Add "MTL_" prefix to new steering semaphore register.  (Mika)
> > 
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt_mcr.c  | 38 ++++++++++++++++++++++---
> >   drivers/gpu/drm/i915/gt/intel_gt_regs.h |  1 +
> >   2 files changed, 35 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > index aa070ae57f11..087e4ac5b68d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > @@ -347,10 +347,9 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
> >    * @flags: storage to save IRQ flags to
> >    *
> >    * Performs locking to protect the steering for the duration of an MCR
> > - * operation.  Depending on the platform, this may be a software lock
> > - * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
> > - * access not only for the driver, but also for external hardware and
> > - * firmware agents).
> > + * operation.  On MTL and beyond, a hardware lock will also be taken to
> > + * serialize access not only for the driver, but also for external hardware and
> > + * firmware agents.
> >    *
> >    * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
> >    *          function is called, although it may be acquired after this
> > @@ -359,12 +358,40 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
> >   void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
> >   {
> >   	unsigned long __flags;
> > +	int err = 0;
> >   	lockdep_assert_not_held(&gt->uncore->lock);
> > +	/*
> > +	 * Starting with MTL, we need to coordinate not only with other
> > +	 * driver threads, but also with hardware/firmware agents.  A dedicated
> > +	 * locking register is used.
> > +	 */
> > +	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
> > +		err = wait_for(intel_uncore_read_fw(gt->uncore,
> > +						    MTL_STEER_SEMAPHORE) == 0x1, 100);
> > +
> 
> If two i915 threads enter here what happens? (Given hw locking is done
> before the spinlock.)

The second thread will see a '0' when it reads the register, indicating
that something else (sw, fw, or hw) already has it locked.  As soon as
the first thread drops the lock, the next read will return '1' and allow
the second thread to proceed.


Matt

> 
> Regards,
> 
> Tvrtko
> 
> > +	/*
> > +	 * Even on platforms with a hardware lock, we'll continue to grab
> > +	 * a software spinlock too for lockdep purposes.  If the hardware lock
> > +	 * was already acquired, there should never be contention on the
> > +	 * software lock.
> > +	 */
> >   	spin_lock_irqsave(&gt->mcr_lock, __flags);
> >   	*flags = __flags;
> > +
> > +	/*
> > +	 * In theory we should never fail to acquire the HW semaphore; this
> > +	 * would indicate some hardware/firmware is misbehaving and not
> > +	 * releasing it properly.
> > +	 */
> > +	if (err == -ETIMEDOUT) {
> > +		drm_err_ratelimited(&gt->i915->drm,
> > +				    "GT%u hardware MCR steering semaphore timed out",
> > +				    gt->info.id);
> > +		add_taint_for_CI(gt->i915, TAINT_WARN);  /* CI is now unreliable */
> > +	}
> >   }
> >   /**
> > @@ -379,6 +406,9 @@ void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
> >   void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags)
> >   {
> >   	spin_unlock_irqrestore(&gt->mcr_lock, flags);
> > +
> > +	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
> > +		intel_uncore_write_fw(gt->uncore, MTL_STEER_SEMAPHORE, 0x1);
> >   }
> >   /**
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > index 784152548472..1618d46cb8c7 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > @@ -67,6 +67,7 @@
> >   #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
> >   #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
> > +#define MTL_STEER_SEMAPHORE			_MMIO(0xfd0)
> >   #define MTL_MCR_SELECTOR			_MMIO(0xfd4)
> >   #define SF_MCR_SELECTOR				_MMIO(0xfd8)
> >   #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Intel-gfx] [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering
  2022-12-05 15:52     ` Matt Roper
@ 2022-12-05 18:16       ` Tvrtko Ursulin
  0 siblings, 0 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2022-12-05 18:16 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, dri-devel


On 05/12/2022 15:52, Matt Roper wrote:
> On Mon, Dec 05, 2022 at 08:58:16AM +0000, Tvrtko Ursulin wrote:
>>
>> On 28/11/2022 23:30, Matt Roper wrote:
>>> Starting with MTL, the driver needs to not only protect the steering
>>> control register from simultaneous software accesses, but also protect
>>> against races with hardware/firmware agents.  The hardware provides a
>>> dedicated locking mechanism to support this via the MTL_STEER_SEMAPHORE
>>> register.  Reading the register acts as a 'trylock' operation; the read
>>> will return 0x1 if the lock is acquired or 0x0 if something else is
>>> already holding the lock; once acquired, writing 0x1 to the register
>>> will release the lock.
>>>
>>> We'll continue to grab the software lock as well, just so lockdep can
>>> track our locking; assuming the hardware lock is behaving properly,
>>> there should never be any contention on the software lock in this case.
>>>
>>> v2:
>>>    - Extend hardware semaphore timeout and add a taint for CI if it ever
>>>      happens (this would imply misbehaving hardware/firmware).  (Mika)
>>>    - Add "MTL_" prefix to new steering semaphore register.  (Mika)
>>>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_gt_mcr.c  | 38 ++++++++++++++++++++++---
>>>    drivers/gpu/drm/i915/gt/intel_gt_regs.h |  1 +
>>>    2 files changed, 35 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> index aa070ae57f11..087e4ac5b68d 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> @@ -347,10 +347,9 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>>>     * @flags: storage to save IRQ flags to
>>>     *
>>>     * Performs locking to protect the steering for the duration of an MCR
>>> - * operation.  Depending on the platform, this may be a software lock
>>> - * (gt->mcr_lock) or a hardware lock (i.e., a register that synchronizes
>>> - * access not only for the driver, but also for external hardware and
>>> - * firmware agents).
>>> + * operation.  On MTL and beyond, a hardware lock will also be taken to
>>> + * serialize access not only for the driver, but also for external hardware and
>>> + * firmware agents.
>>>     *
>>>     * Context: Takes gt->mcr_lock.  uncore->lock should *not* be held when this
>>>     *          function is called, although it may be acquired after this
>>> @@ -359,12 +358,40 @@ static u32 rw_with_mcr_steering(struct intel_gt *gt,
>>>    void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
>>>    {
>>>    	unsigned long __flags;
>>> +	int err = 0;
>>>    	lockdep_assert_not_held(&gt->uncore->lock);
>>> +	/*
>>> +	 * Starting with MTL, we need to coordinate not only with other
>>> +	 * driver threads, but also with hardware/firmware agents.  A dedicated
>>> +	 * locking register is used.
>>> +	 */
>>> +	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
>>> +		err = wait_for(intel_uncore_read_fw(gt->uncore,
>>> +						    MTL_STEER_SEMAPHORE) == 0x1, 100);
>>> +
>>
>> If two i915 threads enter here what happens? (Given hw locking is done
>> before the spinlock.)
> 
> The second thread will see a '0' when it reads the register, indicating
> that something else (sw, fw, or hw) already has it locked.  As soon as
> the first thread drops the lock, the next read will return '1' and allow
> the second thread to proceed.

I was worried if there was a concept of request originator, but this 
then sounds good.

Regards,

Tvrtko

>>> +	/*
>>> +	 * Even on platforms with a hardware lock, we'll continue to grab
>>> +	 * a software spinlock too for lockdep purposes.  If the hardware lock
>>> +	 * was already acquired, there should never be contention on the
>>> +	 * software lock.
>>> +	 */
>>>    	spin_lock_irqsave(&gt->mcr_lock, __flags);
>>>    	*flags = __flags;
>>> +
>>> +	/*
>>> +	 * In theory we should never fail to acquire the HW semaphore; this
>>> +	 * would indicate some hardware/firmware is misbehaving and not
>>> +	 * releasing it properly.
>>> +	 */
>>> +	if (err == -ETIMEDOUT) {
>>> +		drm_err_ratelimited(&gt->i915->drm,
>>> +				    "GT%u hardware MCR steering semaphore timed out",
>>> +				    gt->info.id);
>>> +		add_taint_for_CI(gt->i915, TAINT_WARN);  /* CI is now unreliable */
>>> +	}
>>>    }
>>>    /**
>>> @@ -379,6 +406,9 @@ void intel_gt_mcr_lock(struct intel_gt *gt, unsigned long *flags)
>>>    void intel_gt_mcr_unlock(struct intel_gt *gt, unsigned long flags)
>>>    {
>>>    	spin_unlock_irqrestore(&gt->mcr_lock, flags);
>>> +
>>> +	if (GRAPHICS_VER(gt->i915) >= IP_VER(12, 70))
>>> +		intel_uncore_write_fw(gt->uncore, MTL_STEER_SEMAPHORE, 0x1);
>>>    }
>>>    /**
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
>>> index 784152548472..1618d46cb8c7 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
>>> @@ -67,6 +67,7 @@
>>>    #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
>>>    #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
>>> +#define MTL_STEER_SEMAPHORE			_MMIO(0xfd0)
>>>    #define MTL_MCR_SELECTOR			_MMIO(0xfd4)
>>>    #define SF_MCR_SELECTOR				_MMIO(0xfd8)
>>>    #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-12-05 18:16 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-28 23:30 [PATCH v2 0/5] i915: dedicated MCR locking and hardware semaphore Matt Roper
2022-11-28 23:30 ` [PATCH v2 1/5] drm/i915/gt: Correct kerneldoc for intel_gt_mcr_wait_for_reg() Matt Roper
2022-11-30 15:42   ` Balasubramani Vivekanandan
2022-11-28 23:30 ` [PATCH v2 2/5] drm/i915/gt: Pass gt rather than uncore to lowest-level reads/writes Matt Roper
2022-11-30 15:43   ` Balasubramani Vivekanandan
2022-11-28 23:30 ` [PATCH v2 3/5] drm/i915/gt: Add dedicated MCR lock Matt Roper
2022-11-30 15:45   ` Balasubramani Vivekanandan
2022-11-28 23:30 ` [PATCH v2 4/5] drm/i915/mtl: Add hardware-level lock for steering Matt Roper
2022-12-02 14:46   ` [Intel-gfx] " Balasubramani Vivekanandan
2022-12-05  8:58   ` Tvrtko Ursulin
2022-12-05 15:52     ` Matt Roper
2022-12-05 18:16       ` Tvrtko Ursulin
2022-11-28 23:30 ` [PATCH v2 5/5] drm/i915/mtl: Hold forcewake and MCR lock over PPAT setup Matt Roper
2022-11-30 15:51   ` Balasubramani Vivekanandan
2022-11-30 15:58     ` [PATCH v3 " Matt Roper
2022-12-01  9:26       ` Balasubramani Vivekanandan
2022-12-01 21:01         ` Matt Roper
2022-11-30 16:05     ` [PATCH v2 " Matt Roper

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).