intel-gfx.lists.freedesktop.org archive mirror
* [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
@ 2023-04-06  4:45 Ashutosh Dixit
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Ashutosh Dixit @ 2023-04-06  4:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

The v3 patch is split into 3 patches for easier review; they can be squashed later if needed.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>

Ashutosh Dixit (3):
  drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write
  drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  drm/i915/hwmon: Block waiting for GuC reset to complete

 drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 ++++
 drivers/gpu/drm/i915/i915_hwmon.c     | 75 ++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_hwmon.h     |  7 +++
 3 files changed, 78 insertions(+), 13 deletions(-)

-- 
2.38.0



* [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write
  2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
@ 2023-04-06  4:45 ` Ashutosh Dixit
  2023-04-07 11:08   ` Rodrigo Vivi
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Ashutosh Dixit @ 2023-04-06  4:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

In preparation for follow-on patches, refactor hwm_power_max_write to take the
hwmon_lock mutex and the runtime pm wakeref at the start of the function and
release them at the end, so that each is acquired and released just once.
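
After the refactor the function has the following overall shape (an
illustrative sketch of the result, not the full function; see the diff below):

        mutex_lock(&hwmon->hwmon_lock);
        wakeref = intel_runtime_pm_get(ddat->uncore->rpm);

        /* ... PL1-disable special case and the normal limit write; any
         * failure sets ret and jumps to exit ... */

exit:
        intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
        mutex_unlock(&hwmon->hwmon_lock);
        return ret;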

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/gpu/drm/i915/i915_hwmon.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 8e7dccc8d3a0e..7f44e809ca155 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -396,31 +396,33 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 {
 	struct i915_hwmon *hwmon = ddat->hwmon;
 	intel_wakeref_t wakeref;
+	int ret = 0;
 	u32 nval;
 
+	mutex_lock(&hwmon->hwmon_lock);
+	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
+
 	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
 	if (val == PL1_DISABLE) {
-		mutex_lock(&hwmon->hwmon_lock);
-		with_intel_runtime_pm(ddat->uncore->rpm, wakeref) {
-			intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
-					 PKG_PWR_LIM_1_EN, 0);
-			nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
-		}
-		mutex_unlock(&hwmon->hwmon_lock);
+		intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
+				 PKG_PWR_LIM_1_EN, 0);
+		nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
 
 		if (nval & PKG_PWR_LIM_1_EN)
-			return -ENODEV;
-		return 0;
+			ret = -ENODEV;
+		goto exit;
 	}
 
 	/* Computation in 64-bits to avoid overflow. Round to nearest. */
 	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
 	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
 
-	hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
-					    PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1,
-					    nval);
-	return 0;
+	intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
+			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
+exit:
+	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
+	mutex_unlock(&hwmon->hwmon_lock);
+	return ret;
 }
 
 static int
-- 
2.38.0



* [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
@ 2023-04-06  4:45 ` Ashutosh Dixit
  2023-04-06  9:16   ` kernel test robot
  2023-04-07 11:08   ` Rodrigo Vivi
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Ashutosh Dixit @ 2023-04-06  4:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

On dGfx, an enabled PL1 power limit set to a low value results in a low GPU
operating frequency. It also negates the frequency raise done just before GuC
firmware load, so GuC firmware load can time out. Such timeouts were seen in
the GitLab issue #8062 linked below (where the PL1 power limit was enabled and
set to a low value). Therefore, disable the PL1 power limit while loading GuC
firmware, on platforms where HW allows it.
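
In __uc_init_hw() the new helpers bracket the firmware load roughly as follows
(a simplified sketch of the diff below; error handling and the retry loop are
elided):

        bool pl1en;

        /* Disable a potentially low PL1 power limit to allow freq to be raised */
        i915_hwmon_power_max_disable(gt->i915, &pl1en);

        intel_rps_raise_unslice(&uc_to_gt(uc)->rps);

        /* ... GuC reset and firmware load ... */

        /* Restore the previous PL1 enable state on both success and error paths */
        i915_hwmon_power_max_restore(gt->i915, pl1en);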

v2:
 - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
 - Add hwm_power_max_restore to error return code path

v3 (Jani N):
 - Add/remove explanatory comments
 - Function renames
 - Type corrections
 - Locking annotation

v4:
 - Don't hold the lock across GuC reset (Rodrigo)
 - New locking scheme (suggested by Rodrigo)
 - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko)

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 ++++++
 drivers/gpu/drm/i915/i915_hwmon.c     | 40 +++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_hwmon.h     |  7 +++++
 3 files changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 4ccb4be4c9cba..aa8e35a5636a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -18,6 +18,7 @@
 #include "intel_uc.h"
 
 #include "i915_drv.h"
+#include "i915_hwmon.h"
 
 static const struct intel_uc_ops uc_ops_off;
 static const struct intel_uc_ops uc_ops_on;
@@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
 	struct intel_guc *guc = &uc->guc;
 	struct intel_huc *huc = &uc->huc;
 	int ret, attempts;
+	bool pl1en;
 
 	GEM_BUG_ON(!intel_uc_supports_guc(uc));
 	GEM_BUG_ON(!intel_uc_wants_guc(uc));
@@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
 	else
 		attempts = 1;
 
+	/* Disable a potentially low PL1 power limit to allow freq to be raised */
+	i915_hwmon_power_max_disable(gt->i915, &pl1en);
+
 	intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
 
 	while (attempts--) {
@@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
 		intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
 	}
 
+	i915_hwmon_power_max_restore(gt->i915, pl1en);
+
 	guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
 	guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
 
@@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
 	/* Return GT back to RPn */
 	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
 
+	i915_hwmon_power_max_restore(gt->i915, pl1en);
+
 	__uc_sanitize(uc);
 
 	if (!ret) {
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 7f44e809ca155..9ab8971679fe3 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -50,6 +50,7 @@ struct hwm_drvdata {
 	struct hwm_energy_info ei;		/*  Energy info for energy1_input */
 	char name[12];
 	int gt_n;
+	bool reset_in_progress;
 };
 
 struct i915_hwmon {
@@ -400,6 +401,10 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 	u32 nval;
 
 	mutex_lock(&hwmon->hwmon_lock);
+	if (hwmon->ddat.reset_in_progress) {
+		ret = -EAGAIN;
+		goto unlock;
+	}
 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
 	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
@@ -421,6 +426,7 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
 exit:
 	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
+unlock:
 	mutex_unlock(&hwmon->hwmon_lock);
 	return ret;
 }
@@ -472,6 +478,40 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int chan, long val)
 	}
 }
 
+void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old)
+{
+	struct i915_hwmon *hwmon = i915->hwmon;
+	u32 r;
+
+	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+		return;
+
+	mutex_lock(&hwmon->hwmon_lock);
+
+	hwmon->ddat.reset_in_progress = true;
+	r = intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
+			     PKG_PWR_LIM_1_EN, 0);
+	*old = !!(r & PKG_PWR_LIM_1_EN);
+
+	mutex_unlock(&hwmon->hwmon_lock);
+}
+
+void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
+{
+	struct i915_hwmon *hwmon = i915->hwmon;
+
+	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+		return;
+
+	mutex_lock(&hwmon->hwmon_lock);
+
+	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
+			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
+	hwmon->ddat.reset_in_progress = false;
+
+	mutex_unlock(&hwmon->hwmon_lock);
+}
+
 static umode_t
 hwm_energy_is_visible(const struct hwm_drvdata *ddat, u32 attr)
 {
diff --git a/drivers/gpu/drm/i915/i915_hwmon.h b/drivers/gpu/drm/i915/i915_hwmon.h
index 7ca9cf2c34c96..0fcb7de844061 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.h
+++ b/drivers/gpu/drm/i915/i915_hwmon.h
@@ -7,14 +7,21 @@
 #ifndef __I915_HWMON_H__
 #define __I915_HWMON_H__
 
+#include <linux/types.h>
+
 struct drm_i915_private;
+struct intel_gt;
 
 #if IS_REACHABLE(CONFIG_HWMON)
 void i915_hwmon_register(struct drm_i915_private *i915);
 void i915_hwmon_unregister(struct drm_i915_private *i915);
+void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old);
+void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old);
 #else
 static inline void i915_hwmon_register(struct drm_i915_private *i915) { };
 static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { };
+static inline void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old) { };
+static inline void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) { };
 #endif
 
 #endif /* __I915_HWMON_H__ */
-- 
2.38.0



* [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
@ 2023-04-06  4:45 ` Ashutosh Dixit
  2023-04-07 11:04   ` Rodrigo Vivi
  2023-04-06  5:15 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/guc: Disable PL1 power limit when loading GuC firmware Patchwork
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Ashutosh Dixit @ 2023-04-06  4:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

Instead of erroring out when a GuC reset is in progress, block and wait for
the GuC reset to complete, which is more reasonable uapi behavior.
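
The write path now uses the classic drop-the-lock-and-wait pattern. A minimal
generic sketch (the flag, mutex and waitqueue names here are illustrative; the
patch applies the same shape to reset_in_progress, hwmon_lock and the new wqh
waitqueue):

#include <linux/mutex.h>
#include <linux/wait.h>

static bool busy;
static DEFINE_MUTEX(lock);
static DECLARE_WAIT_QUEUE_HEAD(wq);

static int write_when_idle(void)
{
        int ret;

retry:
        mutex_lock(&lock);
        if (busy) {
                mutex_unlock(&lock);
                /* returns -ERESTARTSYS if interrupted by a signal */
                ret = wait_event_interruptible(wq, !busy);
                if (ret)
                        return ret;
                goto retry;
        }

        /* ... perform the write while holding the lock ... */

        mutex_unlock(&lock);
        return 0;
}

The reset path sets the flag under the lock and, when done, clears it and
calls wake_up_all(&wq) so any sleeping writers re-check the condition, as
i915_hwmon_power_max_restore() does with wqh in the diff below.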

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/gpu/drm/i915/i915_hwmon.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 9ab8971679fe3..4343efb48e61b 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -51,6 +51,7 @@ struct hwm_drvdata {
 	char name[12];
 	int gt_n;
 	bool reset_in_progress;
+	wait_queue_head_t wqh;
 };
 
 struct i915_hwmon {
@@ -400,10 +401,15 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 	int ret = 0;
 	u32 nval;
 
+retry:
 	mutex_lock(&hwmon->hwmon_lock);
 	if (hwmon->ddat.reset_in_progress) {
-		ret = -EAGAIN;
-		goto unlock;
+		mutex_unlock(&hwmon->hwmon_lock);
+		ret = wait_event_interruptible(ddat->wqh,
+					       !hwmon->ddat.reset_in_progress);
+		if (ret)
+			return ret;
+		goto retry;
 	}
 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
@@ -426,7 +432,6 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
 exit:
 	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
-unlock:
 	mutex_unlock(&hwmon->hwmon_lock);
 	return ret;
 }
@@ -508,6 +513,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
 	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
 			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
 	hwmon->ddat.reset_in_progress = false;
+	wake_up_all(&hwmon->ddat.wqh);
 
 	mutex_unlock(&hwmon->hwmon_lock);
 }
@@ -784,6 +790,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
 	ddat->uncore = &i915->uncore;
 	snprintf(ddat->name, sizeof(ddat->name), "i915");
 	ddat->gt_n = -1;
+	init_waitqueue_head(&ddat->wqh);
 
 	for_each_gt(gt, i915, i) {
 		ddat_gt = hwmon->ddat_gt + i;
-- 
2.38.0



* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
                   ` (2 preceding siblings ...)
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
@ 2023-04-06  5:15 ` Patchwork
  2023-04-06  5:32 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  2023-04-06 17:42 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  5 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2023-04-06  5:15 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Disable PL1 power limit when loading GuC firmware
URL   : https://patchwork.freedesktop.org/series/116172/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:148:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:148:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:150:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:150:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:154:26: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:154:26: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:156:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:156:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:156:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:156:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:174:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:174:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:176:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:176:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:180:35: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:180:35: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:182:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:182:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:182:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:182:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:186:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:186:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:188:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:188:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:192:35: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:192:35: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:195:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:195:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:195:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:195:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:237:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:237:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:239:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:239:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:66:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:66:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:92:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:92:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:100:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:100:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:100:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:100:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:100:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:100:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:105:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:105:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:107:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:107:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:108:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:108:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:109:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:109:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:111:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:111:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:111:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:111:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:111:20: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:111:20: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:112:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:112:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:112:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:112:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:112:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:112:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:121:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:121:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:128:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:128:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:166:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:166:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:168:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:168:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:169:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:169:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:170:9: warning: unreplaced symbol 'val'
+./include/asm-generic/bitops/generic-non-atomic.h:170:9: warning: unreplaced symbol 'val'
+./include/asm-generic/bitops/generic-non-atomic.h:172:19: warning: unreplaced symbol 'val'
+./include/asm-generic/bitops/generic-non-atomic.h:172:19: warning: unreplaced symbol 'val'
+./include/asm-generic/bitops/generic-non-atomic.h:172:25: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:172:25: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:172:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:172:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:28:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:28:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:30:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:30:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:31:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:31:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:33:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:33:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:33:16: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:33:16: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:37:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:37:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:39:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:39:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:40:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:40:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:42:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:42:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:42:16: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:42:16: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:55:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:55:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:57:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:57:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:58:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:58:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:60:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:60:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:60:15: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:60:15: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:73:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:73:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:75:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:75:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:76:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:76:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:77:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:77:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:79:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:79:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:79:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:79:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:79:20: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:79:20: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:80:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:80:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:80:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:80:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:80:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:80:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:93:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:93:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:95:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:95:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:96:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:96:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:97:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:97:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:99:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:99:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:99:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:99:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:99:21: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:99:21: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/instrumented-non-atomic.h:100:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:100:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:112:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:112:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:115:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:115:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:127:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:127:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:130:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:130:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:139:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:139:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:142:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:142:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:26:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:26:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:42:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:42:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:58:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:58:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:97:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:97:1: warning: unreplaced symbol 'return'




* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
                   ` (3 preceding siblings ...)
  2023-04-06  5:15 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/guc: Disable PL1 power limit when loading GuC firmware Patchwork
@ 2023-04-06  5:32 ` Patchwork
  2023-04-06 17:42 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  5 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2023-04-06  5:32 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx


== Series Details ==

Series: drm/i915/guc: Disable PL1 power limit when loading GuC firmware
URL   : https://patchwork.freedesktop.org/series/116172/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12976 -> Patchwork_116172v1
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/index.html

Participating hosts (37 -> 36)
------------------------------

  Missing    (1): fi-snb-2520m 

Known issues
------------

  Here are the changes found in Patchwork_116172v1 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3@lmem0:
    - bat-dg2-9:          [PASS][1] -> [FAIL][2] ([fdo#103375]) +3 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/bat-dg2-9/igt@gem_exec_suspend@basic-s3@lmem0.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-dg2-9/igt@gem_exec_suspend@basic-s3@lmem0.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-kbl-soraka:      [PASS][3] -> [DMESG-FAIL][4] ([i915#5334] / [i915#7872])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html

  * igt@i915_selftest@live@slpc:
    - bat-rpls-2:         NOTRUN -> [DMESG-FAIL][5] ([i915#6367] / [i915#7913] / [i915#7996])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-rpls-2/igt@i915_selftest@live@slpc.html

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - bat-rpls-2:         NOTRUN -> [SKIP][6] ([i915#7828])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-rpls-2/igt@kms_chamelium_hpd@common-hpd-after-suspend.html
    - bat-rpls-1:         NOTRUN -> [SKIP][7] ([i915#7828])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-rpls-1/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1:
    - bat-dg2-8:          [PASS][8] -> [FAIL][9] ([i915#7932])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1.html

  * igt@kms_pipe_crc_basic@suspend-read-crc:
    - bat-rpls-1:         NOTRUN -> [SKIP][10] ([i915#1845])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-rpls-1/igt@kms_pipe_crc_basic@suspend-read-crc.html
    - bat-rpls-2:         NOTRUN -> [SKIP][11] ([i915#1845])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-rpls-2/igt@kms_pipe_crc_basic@suspend-read-crc.html

  * igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-dp-3:
    - bat-dg2-9:          [PASS][12] -> [FAIL][13] ([fdo#103375] / [i915#7932])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/bat-dg2-9/igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-dp-3.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-dg2-9/igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-dp-3.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s3@smem:
    - bat-rpls-1:         [ABORT][14] ([i915#6687] / [i915#7978]) -> [PASS][15]
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/bat-rpls-1/igt@gem_exec_suspend@basic-s3@smem.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-rpls-1/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@i915_selftest@live@reset:
    - bat-rpls-2:         [ABORT][16] ([i915#4983] / [i915#7913] / [i915#7981]) -> [PASS][17]
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/bat-rpls-2/igt@i915_selftest@live@reset.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/bat-rpls-2/igt@i915_selftest@live@reset.html

  
  [fdo#103375]: https://bugs.freedesktop.org/show_bug.cgi?id=103375
  [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6687]: https://gitlab.freedesktop.org/drm/intel/issues/6687
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7872]: https://gitlab.freedesktop.org/drm/intel/issues/7872
  [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
  [i915#7932]: https://gitlab.freedesktop.org/drm/intel/issues/7932
  [i915#7978]: https://gitlab.freedesktop.org/drm/intel/issues/7978
  [i915#7981]: https://gitlab.freedesktop.org/drm/intel/issues/7981
  [i915#7996]: https://gitlab.freedesktop.org/drm/intel/issues/7996


Build changes
-------------

  * Linux: CI_DRM_12976 -> Patchwork_116172v1

  CI-20190529: 20190529
  CI_DRM_12976: 517e3daa2f8c38df4969591325325de8315685b9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7240: ef4550e3b7d3c11ba257006bc7d4f8e421667d46 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_116172v1: 517e3daa2f8c38df4969591325325de8315685b9 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

5f9162aad7cf drm/i915/hwmon: Block waiting for GuC reset to complete
8aebbdfa10a0 drm/i915/guc: Disable PL1 power limit when loading GuC firmware
284c2ad65107 drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/index.html



* Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
@ 2023-04-06  9:16   ` kernel test robot
  2023-04-07 11:08   ` Rodrigo Vivi
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-06  9:16 UTC (permalink / raw)
  To: Ashutosh Dixit, intel-gfx; +Cc: Rodrigo Vivi, llvm, dri-devel, oe-kbuild-all

Hi Ashutosh,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-tip/drm-tip]

url:    https://github.com/intel-lab-lkp/linux/commits/Ashutosh-Dixit/drm-i915-hwmon-Get-mutex-and-rpm-ref-just-once-in-hwm_power_max_write/20230406-124659
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
patch link:    https://lore.kernel.org/r/20230406044522.3108359-3-ashutosh.dixit%40intel.com
patch subject: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
config: i386-randconfig-a002-20230403 (https://download.01.org/0day-ci/archive/20230406/202304061654.yjntbbxy-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/b4aa935db7f0b46437cdaa39f0149ad835ceb73c
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Ashutosh-Dixit/drm-i915-hwmon-Get-mutex-and-rpm-ref-just-once-in-hwm_power_max_write/20230406-124659
        git checkout b4aa935db7f0b46437cdaa39f0149ad835ceb73c
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add the following tags where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304061654.yjntbbxy-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gt/uc/intel_uc.c:484:6: warning: variable 'pl1en' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
           if (ret)
               ^~~
   drivers/gpu/drm/i915/gt/uc/intel_uc.c:573:41: note: uninitialized use occurs here
           i915_hwmon_power_max_restore(gt->i915, pl1en);
                                                  ^~~~~
   drivers/gpu/drm/i915/gt/uc/intel_uc.c:484:2: note: remove the 'if' if its condition is always false
           if (ret)
           ^~~~~~~~
   drivers/gpu/drm/i915/gt/uc/intel_uc.c:475:6: warning: variable 'pl1en' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
           if (!intel_uc_fw_is_loadable(&guc->fw)) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/gt/uc/intel_uc.c:573:41: note: uninitialized use occurs here
           i915_hwmon_power_max_restore(gt->i915, pl1en);
                                                  ^~~~~
   drivers/gpu/drm/i915/gt/uc/intel_uc.c:475:2: note: remove the 'if' if its condition is always false
           if (!intel_uc_fw_is_loadable(&guc->fw)) {
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/gt/uc/intel_uc.c:465:12: note: initialize the variable 'pl1en' to silence this warning
           bool pl1en;
                     ^
                      = 0
   2 warnings generated.
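
For reference, a minimal standalone example of the pattern clang's
-Wsometimes-uninitialized flags here (names are illustrative; in the patch the
variable is pl1en, and the offending paths are the early goto err_out exits
taken before i915_hwmon_power_max_disable() has run, after which err_out
passes pl1en to i915_hwmon_power_max_restore()):

int example(int fail_early)
{
        int saved;                      /* no initializer */

        if (fail_early)                 /* 'saved' is never assigned on this path */
                goto err;

        saved = 1;                      /* assigned only on the normal path */
err:
        return saved;                   /* clang: used uninitialized whenever 'if' is true */
}

One possible fix is the one the robot suggests, initializing the variable at
declaration (bool pl1en = false;); another is restoring only on paths that
actually ran the disable step.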


vim +484 drivers/gpu/drm/i915/gt/uc/intel_uc.c

afd088ac05f120d drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2022-01-06  457  
6fbeda0bfd210f9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2020-01-10  458  static int __uc_init_hw(struct intel_uc *uc)
6fbeda0bfd210f9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2020-01-10  459  {
2f8c06cb6622b55 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2023-01-28  460  	struct intel_gt *gt = uc_to_gt(uc);
2f8c06cb6622b55 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2023-01-28  461  	struct drm_i915_private *i915 = gt->i915;
6fbeda0bfd210f9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2020-01-10  462  	struct intel_guc *guc = &uc->guc;
6fbeda0bfd210f9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2020-01-10  463  	struct intel_huc *huc = &uc->huc;
6fbeda0bfd210f9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2020-01-10  464  	int ret, attempts;
b4aa935db7f0b46 drivers/gpu/drm/i915/gt/uc/intel_uc.c Ashutosh Dixit         2023-04-05  465  	bool pl1en;
6fbeda0bfd210f9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2020-01-10  466  
6fbeda0bfd210f9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2020-01-10  467  	GEM_BUG_ON(!intel_uc_supports_guc(uc));
bfe5a40a7b9a967 drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2020-02-18  468  	GEM_BUG_ON(!intel_uc_wants_guc(uc));
356c484822e6ac9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-16  469  
2f8c06cb6622b55 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2023-01-28  470  	print_fw_ver(gt, &guc->fw);
afd088ac05f120d drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2022-01-06  471  
afd088ac05f120d drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2022-01-06  472  	if (intel_uc_uses_huc(uc))
2f8c06cb6622b55 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2023-01-28  473  		print_fw_ver(gt, &huc->fw);
afd088ac05f120d drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2022-01-06  474  
42f96e5bd41e91f drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2020-02-18  475  	if (!intel_uc_fw_is_loadable(&guc->fw)) {
6fbeda0bfd210f9 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2020-01-10  476  		ret = __uc_check_hw(uc) ||
ee402140274e870 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  477  		      intel_uc_fw_is_overridden(&guc->fw) ||
202c98e71692484 drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2020-02-18  478  		      intel_uc_wants_guc_submission(uc) ?
ee402140274e870 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  479  		      intel_uc_fw_status_to_error(guc->fw.status) : 0;
ae7a3166a708bee drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-11  480  		goto err_out;
ae7a3166a708bee drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-11  481  	}
ae7a3166a708bee drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-11  482  
63064d822c964c0 drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2019-07-30  483  	ret = uc_init_wopcm(uc);
63064d822c964c0 drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2019-07-30 @484  	if (ret)
63064d822c964c0 drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2019-07-30  485  		goto err_out;
63064d822c964c0 drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2019-07-30  486  
e5a1ad035938e60 drivers/gpu/drm/i915/gt/uc/intel_uc.c Matthew Brost          2021-07-26  487  	intel_guc_reset_interrupts(guc);
61b5c1587dd82a8 drivers/gpu/drm/i915/intel_uc.c       Michał Winiarski       2017-12-13  488  
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  489  	/* WaEnableuKernelHeaderValidFix:skl */
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  490  	/* WaEnableGuCBootHashCheckNotSet:skl,bxt,kbl */
c816723b6b8a627 drivers/gpu/drm/i915/gt/uc/intel_uc.c Lucas De Marchi        2021-06-05  491  	if (GRAPHICS_VER(i915) == 9)
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  492  		attempts = 3;
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  493  	else
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  494  		attempts = 1;
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  495  
b4aa935db7f0b46 drivers/gpu/drm/i915/gt/uc/intel_uc.c Ashutosh Dixit         2023-04-05  496  	/* Disable a potentially low PL1 power limit to allow freq to be raised */
b4aa935db7f0b46 drivers/gpu/drm/i915/gt/uc/intel_uc.c Ashutosh Dixit         2023-04-05  497  	i915_hwmon_power_max_disable(gt->i915, &pl1en);
b4aa935db7f0b46 drivers/gpu/drm/i915/gt/uc/intel_uc.c Ashutosh Dixit         2023-04-05  498  
1c40d40f6835cde drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-12-16  499  	intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
1c40d40f6835cde drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-12-16  500  
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  501  	while (attempts--) {
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  502  		/*
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  503  		 * Always reset the GuC just before (re)loading, so
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  504  		 * that the state and timing are fairly predictable
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  505  		 */
771051eaa74661f drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-02  506  		ret = __uc_sanitize(uc);
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  507  		if (ret)
61b5c1587dd82a8 drivers/gpu/drm/i915/intel_uc.c       Michał Winiarski       2017-12-13  508  			goto err_out;
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  509  
a8dc0f6d187bccc drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  510  		intel_huc_fw_upload(huc);
386e300fe9fae7e drivers/gpu/drm/i915/intel_uc.c       Michal Wajdeczko       2019-05-27  511  		intel_guc_ads_reset(guc);
2bf8fb39eb70b6c drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2019-07-24  512  		intel_guc_write_params(guc);
e8668bbcb0f91c7 drivers/gpu/drm/i915/intel_uc.c       Michal Wajdeczko       2017-10-16  513  		ret = intel_guc_fw_upload(guc);
52b832606038c5b drivers/gpu/drm/i915/intel_uc.c       Robert M. Fosha        2019-03-29  514  		if (ret == 0)
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  515  			break;
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  516  
2f8c06cb6622b55 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2023-01-28  517  		gt_dbg(gt, "GuC fw load failed (%pe) will reset and retry %d more time(s)\n",
2f8c06cb6622b55 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2023-01-28  518  		       ERR_PTR(ret), attempts);
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  519  	}
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  520  
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  521  	/* Did we succeded or run out of retries? */
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  522  	if (ret)
ac58d2ab0ad9c8b drivers/gpu/drm/i915/intel_uc.c       Daniele Ceraolo Spurio 2017-05-22  523  		goto err_log_capture;
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  524  
789a625158b0c0c drivers/gpu/drm/i915/intel_uc.c       Michal Wajdeczko       2017-05-02  525  	ret = guc_enable_communication(guc);
789a625158b0c0c drivers/gpu/drm/i915/intel_uc.c       Michal Wajdeczko       2017-05-02  526  	if (ret)
ac58d2ab0ad9c8b drivers/gpu/drm/i915/intel_uc.c       Daniele Ceraolo Spurio 2017-05-22  527  		goto err_log_capture;
789a625158b0c0c drivers/gpu/drm/i915/intel_uc.c       Michal Wajdeczko       2017-05-02  528  
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  529  	/*
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  530  	 * GSC-loaded HuC is authenticated by the GSC, so we don't need to
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  531  	 * trigger the auth here. However, given that the HuC loaded this way
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  532  	 * survive GT reset, we still need to update our SW bookkeeping to make
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  533  	 * sure it reflects the correct HW status.
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  534  	 */
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  535  	if (intel_huc_is_loaded_by_gsc(huc))
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  536  		intel_huc_update_auth_status(huc);
6f67930af78f10a drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2022-05-04  537  	else
a8dc0f6d187bccc drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  538  		intel_huc_auth(huc);
0dfa1cee613e03c drivers/gpu/drm/i915/intel_uc.c       Michal Wajdeczko       2017-12-06  539  
cd414f4f59f64d7 drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2023-02-17  540  	if (intel_uc_uses_guc_submission(uc)) {
cd414f4f59f64d7 drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2023-02-17  541  		ret = intel_guc_submission_enable(guc);
cd414f4f59f64d7 drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2023-02-17  542  		if (ret)
cd414f4f59f64d7 drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2023-02-17  543  			goto err_log_capture;
cd414f4f59f64d7 drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2023-02-17  544  	}
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  545  
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  546  	if (intel_uc_uses_guc_slpc(uc)) {
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  547  		ret = intel_guc_slpc_enable(&guc->slpc);
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  548  		if (ret)
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  549  			goto err_submission;
1c40d40f6835cde drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-12-16  550  	} else {
1c40d40f6835cde drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-12-16  551  		/* Restore GT back to RPn for non-SLPC path */
1c40d40f6835cde drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-12-16  552  		intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  553  	}
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  554  
b4aa935db7f0b46 drivers/gpu/drm/i915/gt/uc/intel_uc.c Ashutosh Dixit         2023-04-05  555  	i915_hwmon_power_max_restore(gt->i915, pl1en);
b4aa935db7f0b46 drivers/gpu/drm/i915/gt/uc/intel_uc.c Ashutosh Dixit         2023-04-05  556  
4fd4fde8e42e164 drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2023-02-06  557  	guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
4fd4fde8e42e164 drivers/gpu/drm/i915/gt/uc/intel_uc.c John Harrison          2023-02-06  558  	guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  559  
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  560  	return 0;
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  561  
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  562  	/*
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  563  	 * We've failed to load the firmware :(
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  564  	 */
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  565  err_submission:
63c0eb30bfe9269 drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-07-30  566  	intel_guc_submission_disable(guc);
ac58d2ab0ad9c8b drivers/gpu/drm/i915/intel_uc.c       Daniele Ceraolo Spurio 2017-05-22  567  err_log_capture:
32ff76e80c2400c drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-02  568  	__uc_capture_load_err_log(uc);
121981fafe699d9 drivers/gpu/drm/i915/intel_uc.c       Michal Wajdeczko       2017-12-06  569  err_out:
1c40d40f6835cde drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-12-16  570  	/* Return GT back to RPn */
1c40d40f6835cde drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-12-16  571  	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
1c40d40f6835cde drivers/gpu/drm/i915/gt/uc/intel_uc.c Vinay Belgaumkar       2021-12-16  572  
b4aa935db7f0b46 drivers/gpu/drm/i915/gt/uc/intel_uc.c Ashutosh Dixit         2023-04-05  573  	i915_hwmon_power_max_restore(gt->i915, pl1en);
b4aa935db7f0b46 drivers/gpu/drm/i915/gt/uc/intel_uc.c Ashutosh Dixit         2023-04-05  574  
ca7b2c1bbede618 drivers/gpu/drm/i915/gt/uc/intel_uc.c Daniele Ceraolo Spurio 2019-07-13  575  	__uc_sanitize(uc);
89195bab5d8c540 drivers/gpu/drm/i915/intel_uc.c       Michal Wajdeczko       2019-05-22  576  
ee402140274e870 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  577  	if (!ret) {
2f8c06cb6622b55 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2023-01-28  578  		gt_notice(gt, "GuC is uninitialized\n");
ee402140274e870 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  579  		/* We want to run without GuC submission */
ee402140274e870 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  580  		return 0;
ee402140274e870 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  581  	}
ee402140274e870 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-18  582  
2f8c06cb6622b55 drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2023-01-28  583  	gt_probe_error(gt, "GuC initialization failed %pe\n", ERR_PTR(ret));
a5f978c3609f02a drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-11  584  
a5f978c3609f02a drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-11  585  	/* We want to keep KMS alive */
a5f978c3609f02a drivers/gpu/drm/i915/gt/uc/intel_uc.c Michal Wajdeczko       2019-08-11  586  	return -EIO;
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  587  }
6cd5a72c357732e drivers/gpu/drm/i915/intel_uc.c       Arkadiusz Hiler        2017-03-14  588  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* [Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
                   ` (4 preceding siblings ...)
  2023-04-06  5:32 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2023-04-06 17:42 ` Patchwork
  5 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2023-04-06 17:42 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx


== Series Details ==

Series: drm/i915/guc: Disable PL1 power limit when loading GuC firmware
URL   : https://patchwork.freedesktop.org/series/116172/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12976_full -> Patchwork_116172v1_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Participating hosts (7 -> 7)
------------------------------

  No changes in participating hosts

Known issues
------------

  Here are the changes found in Patchwork_116172v1_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
    - shard-apl:          [PASS][1] -> [FAIL][2] ([i915#2842])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-apl3/igt@gem_exec_fair@basic-pace-solo@rcs0.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl4/igt@gem_exec_fair@basic-pace-solo@rcs0.html

  * igt@gem_lmem_swapping@heavy-verify-multi:
    - shard-apl:          NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#4613])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl7/igt@gem_lmem_swapping@heavy-verify-multi.html
    - shard-glk:          NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#4613])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-glk7/igt@gem_lmem_swapping@heavy-verify-multi.html

  * igt@kms_ccs@pipe-c-crc-primary-basic-y_tiled_gen12_mc_ccs:
    - shard-apl:          NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#3886]) +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl6/igt@kms_ccs@pipe-c-crc-primary-basic-y_tiled_gen12_mc_ccs.html

  * igt@kms_cursor_crc@cursor-sliding-512x170:
    - shard-glk:          NOTRUN -> [SKIP][6] ([fdo#109271]) +20 similar issues
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-glk3/igt@kms_cursor_crc@cursor-sliding-512x170.html

  * igt@kms_fbcon_fbt@fbc-suspend:
    - shard-apl:          [PASS][7] -> [FAIL][8] ([i915#4767])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-apl1/igt@kms_fbcon_fbt@fbc-suspend.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl6/igt@kms_fbcon_fbt@fbc-suspend.html

  * igt@kms_plane_scaling@planes-upscale-20x20-downscale-factor-0-75@pipe-c-dp-1:
    - shard-apl:          NOTRUN -> [SKIP][9] ([fdo#109271]) +40 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl6/igt@kms_plane_scaling@planes-upscale-20x20-downscale-factor-0-75@pipe-c-dp-1.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area:
    - shard-glk:          NOTRUN -> [SKIP][10] ([fdo#109271] / [i915#658])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-glk7/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area.html

  * igt@kms_psr2_su@page_flip-p010:
    - shard-apl:          NOTRUN -> [SKIP][11] ([fdo#109271] / [i915#658]) +1 similar issue
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl6/igt@kms_psr2_su@page_flip-p010.html

  
#### Possible fixes ####

  * igt@drm_fdinfo@most-busy-idle-check-all@rcs0:
    - {shard-rkl}:        [FAIL][12] ([i915#7742]) -> [PASS][13]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-rkl-3/igt@drm_fdinfo@most-busy-idle-check-all@rcs0.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-rkl-4/igt@drm_fdinfo@most-busy-idle-check-all@rcs0.html

  * igt@gem_ctx_exec@basic-nohangcheck:
    - {shard-tglu}:       [FAIL][14] ([i915#6268]) -> [PASS][15]
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-tglu-6/igt@gem_ctx_exec@basic-nohangcheck.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-tglu-8/igt@gem_ctx_exec@basic-nohangcheck.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - {shard-rkl}:        [FAIL][16] ([i915#2842]) -> [PASS][17]
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-rkl-3/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-rkl-4/igt@gem_exec_fair@basic-pace-share@rcs0.html
    - shard-apl:          [FAIL][18] ([i915#2842]) -> [PASS][19]
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-apl7/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl2/igt@gem_exec_fair@basic-pace-share@rcs0.html
    - {shard-tglu}:       [FAIL][20] ([i915#2842]) -> [PASS][21]
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-tglu-6/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-tglu-10/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-apl:          [ABORT][22] ([i915#5566]) -> [PASS][23]
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-apl6/igt@gen9_exec_parse@allowed-single.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl7/igt@gen9_exec_parse@allowed-single.html
    - shard-glk:          [ABORT][24] ([i915#5566]) -> [PASS][25]
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-glk7/igt@gen9_exec_parse@allowed-single.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-glk7/igt@gen9_exec_parse@allowed-single.html

  * igt@i915_pm_dc@dc9-dpms:
    - {shard-tglu}:       [SKIP][26] ([i915#4281]) -> [PASS][27]
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-tglu-7/igt@i915_pm_dc@dc9-dpms.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-tglu-2/igt@i915_pm_dc@dc9-dpms.html

  * igt@i915_pm_rpm@dpms-mode-unset-non-lpsp:
    - {shard-rkl}:        [SKIP][28] ([i915#1397]) -> [PASS][29] +1 similar issue
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-rkl-7/igt@i915_pm_rpm@dpms-mode-unset-non-lpsp.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-rkl-3/igt@i915_pm_rpm@dpms-mode-unset-non-lpsp.html

  * igt@i915_pm_rpm@modeset-lpsp-stress-no-wait:
    - {shard-dg1}:        [SKIP][30] ([i915#1397]) -> [PASS][31]
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-dg1-17/igt@i915_pm_rpm@modeset-lpsp-stress-no-wait.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-dg1-14/igt@i915_pm_rpm@modeset-lpsp-stress-no-wait.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-apl:          [FAIL][32] ([i915#2346]) -> [PASS][33]
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-apl6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl7/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_cursor_legacy@single-bo@pipe-b:
    - {shard-rkl}:        [INCOMPLETE][34] ([i915#8011]) -> [PASS][35]
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-rkl-7/igt@kms_cursor_legacy@single-bo@pipe-b.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-rkl-4/igt@kms_cursor_legacy@single-bo@pipe-b.html

  * igt@kms_flip@2x-plain-flip-ts-check-interruptible@ab-hdmi-a1-hdmi-a2:
    - shard-glk:          [FAIL][36] ([i915#2122]) -> [PASS][37]
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-glk7/igt@kms_flip@2x-plain-flip-ts-check-interruptible@ab-hdmi-a1-hdmi-a2.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-glk7/igt@kms_flip@2x-plain-flip-ts-check-interruptible@ab-hdmi-a1-hdmi-a2.html

  * igt@kms_flip@flip-vs-suspend-interruptible@b-dp1:
    - shard-apl:          [DMESG-WARN][38] -> [PASS][39]
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-apl7/igt@kms_flip@flip-vs-suspend-interruptible@b-dp1.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl3/igt@kms_flip@flip-vs-suspend-interruptible@b-dp1.html

  * igt@kms_flip@flip-vs-suspend@a-dp1:
    - shard-apl:          [ABORT][40] ([i915#180]) -> [PASS][41]
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12976/shard-apl1/igt@kms_flip@flip-vs-suspend@a-dp1.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/shard-apl6/igt@kms_flip@flip-vs-suspend@a-dp1.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109289]: https://bugs.freedesktop.org/show_bug.cgi?id=109289
  [fdo#109307]: https://bugs.freedesktop.org/show_bug.cgi?id=109307
  [fdo#109506]: https://bugs.freedesktop.org/show_bug.cgi?id=109506
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1839]: https://gitlab.freedesktop.org/drm/intel/issues/1839
  [i915#1902]: https://gitlab.freedesktop.org/drm/intel/issues/1902
  [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2527]: https://gitlab.freedesktop.org/drm/intel/issues/2527
  [i915#2575]: https://gitlab.freedesktop.org/drm/intel/issues/2575
  [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#280]: https://gitlab.freedesktop.org/drm/intel/issues/280
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#315]: https://gitlab.freedesktop.org/drm/intel/issues/315
  [i915#3281]: https://gitlab.freedesktop.org/drm/intel/issues/3281
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3297]: https://gitlab.freedesktop.org/drm/intel/issues/3297
  [i915#3359]: https://gitlab.freedesktop.org/drm/intel/issues/3359
  [i915#3458]: https://gitlab.freedesktop.org/drm/intel/issues/3458
  [i915#3469]: https://gitlab.freedesktop.org/drm/intel/issues/3469
  [i915#3539]: https://gitlab.freedesktop.org/drm/intel/issues/3539
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3638]: https://gitlab.freedesktop.org/drm/intel/issues/3638
  [i915#3689]: https://gitlab.freedesktop.org/drm/intel/issues/3689
  [i915#3743]: https://gitlab.freedesktop.org/drm/intel/issues/3743
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#3952]: https://gitlab.freedesktop.org/drm/intel/issues/3952
  [i915#3989]: https://gitlab.freedesktop.org/drm/intel/issues/3989
  [i915#4070]: https://gitlab.freedesktop.org/drm/intel/issues/4070
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4078]: https://gitlab.freedesktop.org/drm/intel/issues/4078
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4098]: https://gitlab.freedesktop.org/drm/intel/issues/4098
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212
  [i915#4213]: https://gitlab.freedesktop.org/drm/intel/issues/4213
  [i915#4215]: https://gitlab.freedesktop.org/drm/intel/issues/4215
  [i915#4270]: https://gitlab.freedesktop.org/drm/intel/issues/4270
  [i915#4281]: https://gitlab.freedesktop.org/drm/intel/issues/4281
  [i915#4349]: https://gitlab.freedesktop.org/drm/intel/issues/4349
  [i915#4538]: https://gitlab.freedesktop.org/drm/intel/issues/4538
  [i915#454]: https://gitlab.freedesktop.org/drm/intel/issues/454
  [i915#4565]: https://gitlab.freedesktop.org/drm/intel/issues/4565
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4767]: https://gitlab.freedesktop.org/drm/intel/issues/4767
  [i915#4812]: https://gitlab.freedesktop.org/drm/intel/issues/4812
  [i915#4816]: https://gitlab.freedesktop.org/drm/intel/issues/4816
  [i915#4818]: https://gitlab.freedesktop.org/drm/intel/issues/4818
  [i915#4833]: https://gitlab.freedesktop.org/drm/intel/issues/4833
  [i915#4852]: https://gitlab.freedesktop.org/drm/intel/issues/4852
  [i915#4854]: https://gitlab.freedesktop.org/drm/intel/issues/4854
  [i915#4859]: https://gitlab.freedesktop.org/drm/intel/issues/4859
  [i915#4860]: https://gitlab.freedesktop.org/drm/intel/issues/4860
  [i915#4880]: https://gitlab.freedesktop.org/drm/intel/issues/4880
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235
  [i915#5286]: https://gitlab.freedesktop.org/drm/intel/issues/5286
  [i915#5563]: https://gitlab.freedesktop.org/drm/intel/issues/5563
  [i915#5566]: https://gitlab.freedesktop.org/drm/intel/issues/5566
  [i915#6095]: https://gitlab.freedesktop.org/drm/intel/issues/6095
  [i915#6227]: https://gitlab.freedesktop.org/drm/intel/issues/6227
  [i915#6268]: https://gitlab.freedesktop.org/drm/intel/issues/6268
  [i915#6301]: https://gitlab.freedesktop.org/drm/intel/issues/6301
  [i915#6524]: https://gitlab.freedesktop.org/drm/intel/issues/6524
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#7116]: https://gitlab.freedesktop.org/drm/intel/issues/7116
  [i915#7561]: https://gitlab.freedesktop.org/drm/intel/issues/7561
  [i915#7697]: https://gitlab.freedesktop.org/drm/intel/issues/7697
  [i915#7701]: https://gitlab.freedesktop.org/drm/intel/issues/7701
  [i915#7711]: https://gitlab.freedesktop.org/drm/intel/issues/7711
  [i915#7742]: https://gitlab.freedesktop.org/drm/intel/issues/7742
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#8011]: https://gitlab.freedesktop.org/drm/intel/issues/8011
  [i915#8253]: https://gitlab.freedesktop.org/drm/intel/issues/8253
  [i915#8292]: https://gitlab.freedesktop.org/drm/intel/issues/8292
  [i915#8308]: https://gitlab.freedesktop.org/drm/intel/issues/8308


Build changes
-------------

  * Linux: CI_DRM_12976 -> Patchwork_116172v1

  CI-20190529: 20190529
  CI_DRM_12976: 517e3daa2f8c38df4969591325325de8315685b9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7240: ef4550e3b7d3c11ba257006bc7d4f8e421667d46 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_116172v1: 517e3daa2f8c38df4969591325325de8315685b9 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116172v1/index.html

[-- Attachment #2: Type: text/html, Size: 12335 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
@ 2023-04-07 11:04   ` Rodrigo Vivi
  2023-04-10 22:40     ` Dixit, Ashutosh
  0 siblings, 1 reply; 25+ messages in thread
From: Rodrigo Vivi @ 2023-04-07 11:04 UTC (permalink / raw)
  To: Ashutosh Dixit; +Cc: intel-gfx, dri-devel

On Wed, Apr 05, 2023 at 09:45:22PM -0700, Ashutosh Dixit wrote:
> Instead of erroring out when GuC reset is in progress, block waiting for
> GuC reset to complete which is a more reasonable uapi behavior.
> 
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_hwmon.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> index 9ab8971679fe3..4343efb48e61b 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.c
> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> @@ -51,6 +51,7 @@ struct hwm_drvdata {
>  	char name[12];
>  	int gt_n;
>  	bool reset_in_progress;
> +	wait_queue_head_t wqh;
>  };
>  
>  struct i915_hwmon {
> @@ -400,10 +401,15 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>  	int ret = 0;
>  	u32 nval;
>  
> +retry:
>  	mutex_lock(&hwmon->hwmon_lock);
>  	if (hwmon->ddat.reset_in_progress) {
> -		ret = -EAGAIN;
> -		goto unlock;
> +		mutex_unlock(&hwmon->hwmon_lock);
> +		ret = wait_event_interruptible(ddat->wqh,
> +					       !hwmon->ddat.reset_in_progress);

this is indeed very clever!
maybe just use the timeout version to be on the safe side and then return
-EAGAIN on timeout?

my fear is probably due to the lack of knowledge on this wait queue, but
I'm wondering what could go wrong if due to some funny race you enter this
check right after wake_up_all below has passed and then you will be here
indefinitely waiting...
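
Something along these lines is what I mean (untested sketch, the 2 second
value is only an example):

	ret = wait_event_interruptible_timeout(ddat->wqh,
					       !hwmon->ddat.reset_in_progress,
					       msecs_to_jiffies(2000));
	if (ret < 0)		/* interrupted by a signal */
		return ret;
	if (!ret)		/* timed out, reset still in progress */
		return -EAGAIN;
	goto retry;		/* reset done, retake the lock and recheck */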

> +		if (ret)
> +			return ret;
> +		goto retry;
>  	}
>  	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
>  
> @@ -426,7 +432,6 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>  			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
>  exit:
>  	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> -unlock:
>  	mutex_unlock(&hwmon->hwmon_lock);
>  	return ret;
>  }
> @@ -508,6 +513,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
>  	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
>  			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
>  	hwmon->ddat.reset_in_progress = false;
> +	wake_up_all(&hwmon->ddat.wqh);
>  
>  	mutex_unlock(&hwmon->hwmon_lock);
>  }
> @@ -784,6 +790,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
>  	ddat->uncore = &i915->uncore;
>  	snprintf(ddat->name, sizeof(ddat->name), "i915");
>  	ddat->gt_n = -1;
> +	init_waitqueue_head(&ddat->wqh);
>  
>  	for_each_gt(gt, i915, i) {
>  		ddat_gt = hwmon->ddat_gt + i;
> -- 
> 2.38.0
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
  2023-04-06  9:16   ` kernel test robot
@ 2023-04-07 11:08   ` Rodrigo Vivi
  2023-04-10 22:17     ` Dixit, Ashutosh
  1 sibling, 1 reply; 25+ messages in thread
From: Rodrigo Vivi @ 2023-04-07 11:08 UTC (permalink / raw)
  To: Ashutosh Dixit; +Cc: intel-gfx, dri-devel

On Wed, Apr 05, 2023 at 09:45:21PM -0700, Ashutosh Dixit wrote:
> On dGfx, the PL1 power limit being enabled and set to a low value results
> in a low GPU operating freq. It also negates the freq raise operation which
> is done before GuC firmware load. As a result GuC firmware load can time
> out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
> limit was enabled and set to a low value). Therefore disable the PL1 power
> limit when allowed by HW when loading GuC firmware.
> 
> v2:
>  - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
>  - Add hwm_power_max_restore to error return code path
> 
> v3 (Jani N):
>  - Add/remove explanatory comments
>  - Function renames
>  - Type corrections
>  - Locking annotation
> 
> v4:
>  - Don't hold the lock across GuC reset (Rodrigo)
>  - New locking scheme (suggested by Rodrigo)
>  - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko)
> 
> Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 ++++++
>  drivers/gpu/drm/i915/i915_hwmon.c     | 40 +++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_hwmon.h     |  7 +++++
>  3 files changed, 56 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> index 4ccb4be4c9cba..aa8e35a5636a0 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> @@ -18,6 +18,7 @@
>  #include "intel_uc.h"
>  
>  #include "i915_drv.h"
> +#include "i915_hwmon.h"
>  
>  static const struct intel_uc_ops uc_ops_off;
>  static const struct intel_uc_ops uc_ops_on;
> @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
>  	struct intel_guc *guc = &uc->guc;
>  	struct intel_huc *huc = &uc->huc;
>  	int ret, attempts;
> +	bool pl1en;

we need to initialize this to make warn free builds happy...
what's our default btw? false? true? we need to read it back?

>  
>  	GEM_BUG_ON(!intel_uc_supports_guc(uc));
>  	GEM_BUG_ON(!intel_uc_wants_guc(uc));
> @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
>  	else
>  		attempts = 1;
>  
> +	/* Disable a potentially low PL1 power limit to allow freq to be raised */
> +	i915_hwmon_power_max_disable(gt->i915, &pl1en);
> +
>  	intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
>  
>  	while (attempts--) {
> @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
>  		intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
>  	}
>  
> +	i915_hwmon_power_max_restore(gt->i915, pl1en);
> +
>  	guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
>  	guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
>  
> @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
>  	/* Return GT back to RPn */
>  	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
>  
> +	i915_hwmon_power_max_restore(gt->i915, pl1en);
> +
>  	__uc_sanitize(uc);
>  
>  	if (!ret) {
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> index 7f44e809ca155..9ab8971679fe3 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.c
> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> @@ -50,6 +50,7 @@ struct hwm_drvdata {
>  	struct hwm_energy_info ei;		/*  Energy info for energy1_input */
>  	char name[12];
>  	int gt_n;
> +	bool reset_in_progress;
>  };
>  
>  struct i915_hwmon {
> @@ -400,6 +401,10 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>  	u32 nval;
>  
>  	mutex_lock(&hwmon->hwmon_lock);
> +	if (hwmon->ddat.reset_in_progress) {
> +		ret = -EAGAIN;
> +		goto unlock;
> +	}
>  	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
>  
>  	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> @@ -421,6 +426,7 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>  			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
>  exit:
>  	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> +unlock:
>  	mutex_unlock(&hwmon->hwmon_lock);
>  	return ret;
>  }
> @@ -472,6 +478,40 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int chan, long val)
>  	}
>  }
>  
> +void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old)
> +{
> +	struct i915_hwmon *hwmon = i915->hwmon;
> +	u32 r;
> +
> +	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> +		return;
> +
> +	mutex_lock(&hwmon->hwmon_lock);
> +
> +	hwmon->ddat.reset_in_progress = true;
> +	r = intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> +			     PKG_PWR_LIM_1_EN, 0);
> +	*old = !!(r & PKG_PWR_LIM_1_EN);
> +
> +	mutex_unlock(&hwmon->hwmon_lock);
> +}
> +
> +void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> +{
> +	struct i915_hwmon *hwmon = i915->hwmon;
> +
> +	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> +		return;
> +
> +	mutex_lock(&hwmon->hwmon_lock);
> +
> +	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> +			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> +	hwmon->ddat.reset_in_progress = false;
> +
> +	mutex_unlock(&hwmon->hwmon_lock);
> +}

you could have combined both functions in a
i915_hwmon_power_max_set(struct drm_i915_private *i915, bool val, bool *old)

then pass NULL to old on the restoration times
and have
    if (old)
       	*old = !!(r & PKG_PWR_LIM_1_EN);
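
i.e. roughly something like below (untested, just to illustrate the idea;
keying reset_in_progress off whether a save pointer was passed is only one
way to do it):

void i915_hwmon_power_max_set(struct drm_i915_private *i915, bool val, bool *old)
{
	struct i915_hwmon *hwmon = i915->hwmon;
	u32 r;

	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
		return;

	mutex_lock(&hwmon->hwmon_lock);

	/* a non-NULL 'old' means we are saving state for the reset path */
	hwmon->ddat.reset_in_progress = !!old;
	r = intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
			     PKG_PWR_LIM_1_EN, val ? PKG_PWR_LIM_1_EN : 0);
	if (old)
		*old = !!(r & PKG_PWR_LIM_1_EN);

	mutex_unlock(&hwmon->hwmon_lock);
}

with the callers becoming i915_hwmon_power_max_set(gt->i915, false, &pl1en)
before the GuC load and i915_hwmon_power_max_set(gt->i915, pl1en, NULL) on
the restore paths.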

But really up to you here, the current code is clear to follow imho
so, with the pl1en initialization fixed:

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> +
>  static umode_t
>  hwm_energy_is_visible(const struct hwm_drvdata *ddat, u32 attr)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.h b/drivers/gpu/drm/i915/i915_hwmon.h
> index 7ca9cf2c34c96..0fcb7de844061 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.h
> +++ b/drivers/gpu/drm/i915/i915_hwmon.h
> @@ -7,14 +7,21 @@
>  #ifndef __I915_HWMON_H__
>  #define __I915_HWMON_H__
>  
> +#include <linux/types.h>
> +
>  struct drm_i915_private;
> +struct intel_gt;
>  
>  #if IS_REACHABLE(CONFIG_HWMON)
>  void i915_hwmon_register(struct drm_i915_private *i915);
>  void i915_hwmon_unregister(struct drm_i915_private *i915);
> +void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old);
> +void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old);
>  #else
>  static inline void i915_hwmon_register(struct drm_i915_private *i915) { };
>  static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { };
> +static inline void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old) { };
> +static inline void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) { };
>  #endif
>  
>  #endif /* __I915_HWMON_H__ */
> -- 
> 2.38.0
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
@ 2023-04-07 11:08   ` Rodrigo Vivi
  0 siblings, 0 replies; 25+ messages in thread
From: Rodrigo Vivi @ 2023-04-07 11:08 UTC (permalink / raw)
  To: Ashutosh Dixit; +Cc: intel-gfx, dri-devel

On Wed, Apr 05, 2023 at 09:45:20PM -0700, Ashutosh Dixit wrote:
> In preparation for follow-on patches, refactor hwm_power_max_write to take
> hwmon_lock and runtime pm wakeref at start of the function and release them
> at the end, therefore acquiring these just once each.
> 
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_hwmon.c | 28 +++++++++++++++-------------
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> index 8e7dccc8d3a0e..7f44e809ca155 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.c
> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> @@ -396,31 +396,33 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>  {
>  	struct i915_hwmon *hwmon = ddat->hwmon;
>  	intel_wakeref_t wakeref;
> +	int ret = 0;
>  	u32 nval;
>  
> +	mutex_lock(&hwmon->hwmon_lock);
> +	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> +
>  	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
>  	if (val == PL1_DISABLE) {
> -		mutex_lock(&hwmon->hwmon_lock);
> -		with_intel_runtime_pm(ddat->uncore->rpm, wakeref) {
> -			intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
> -					 PKG_PWR_LIM_1_EN, 0);
> -			nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
> -		}
> -		mutex_unlock(&hwmon->hwmon_lock);
> +		intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
> +				 PKG_PWR_LIM_1_EN, 0);
> +		nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
>  
>  		if (nval & PKG_PWR_LIM_1_EN)
> -			return -ENODEV;
> -		return 0;
> +			ret = -ENODEV;
> +		goto exit;
>  	}
>  
>  	/* Computation in 64-bits to avoid overflow. Round to nearest. */
>  	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
>  	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
>  
> -	hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
> -					    PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1,
> -					    nval);
> -	return 0;
> +	intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
> +			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> +exit:
> +	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> +	mutex_unlock(&hwmon->hwmon_lock);
> +	return ret;
>  }
>  
>  static int
> -- 
> 2.38.0
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-07 11:08   ` Rodrigo Vivi
@ 2023-04-10 22:17     ` Dixit, Ashutosh
  0 siblings, 0 replies; 25+ messages in thread
From: Dixit, Ashutosh @ 2023-04-10 22:17 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel

On Fri, 07 Apr 2023 04:08:31 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Wed, Apr 05, 2023 at 09:45:21PM -0700, Ashutosh Dixit wrote:
> > On dGfx, the PL1 power limit being enabled and set to a low value results
> > in a low GPU operating freq. It also negates the freq raise operation which
> > is done before GuC firmware load. As a result GuC firmware load can time
> > out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
> > limit was enabled and set to a low value). Therefore disable the PL1 power
> > limit when allowed by HW when loading GuC firmware.
> >
> > v2:
> >  - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
> >  - Add hwm_power_max_restore to error return code path
> >
> > v3 (Jani N):
> >  - Add/remove explanatory comments
> >  - Function renames
> >  - Type corrections
> >  - Locking annotation
> >
> > v4:
> >  - Don't hold the lock across GuC reset (Rodrigo)
> >  - New locking scheme (suggested by Rodrigo)
> >  - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko)
> >
> > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 ++++++
> >  drivers/gpu/drm/i915/i915_hwmon.c     | 40 +++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/i915_hwmon.h     |  7 +++++
> >  3 files changed, 56 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > index 4ccb4be4c9cba..aa8e35a5636a0 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > @@ -18,6 +18,7 @@
> >  #include "intel_uc.h"
> >
> >  #include "i915_drv.h"
> > +#include "i915_hwmon.h"
> >
> >  static const struct intel_uc_ops uc_ops_off;
> >  static const struct intel_uc_ops uc_ops_on;
> > @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> >	struct intel_guc *guc = &uc->guc;
> >	struct intel_huc *huc = &uc->huc;
> >	int ret, attempts;
> > +	bool pl1en;
>
> we need to initialize this to make warn free builds happy...
> what's our default btw? false? true? we need to read it back?

Yes this was a real bug caught by the kernel build robot. We don't know the
default till we read it back, which would mean exposing a new function. I
have avoided exposing the new function, i.e. I have fixed this by creating a
new (err_rps) label which will make sure that the variable is not used
unless it is initialized. I am not expecting to see warnings from the build
robot with this fix now.
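
The idea is roughly as below (sketch only, the exact labels and goto targets
in the final version may differ a bit):

	/* early failures jump to err_out and never touch pl1en */
	ret = uc_init_wopcm(uc);
	if (ret)
		goto err_out;

	...

	/* pl1en is always initialized before the freq is raised */
	i915_hwmon_power_max_disable(gt->i915, &pl1en);
	intel_rps_raise_unslice(&uc_to_gt(uc)->rps);

	/* failures from this point on unwind through err_rps */
	...

err_rps:
	/* Return GT back to RPn */
	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);

	i915_hwmon_power_max_restore(gt->i915, pl1en);
err_out:
	__uc_sanitize(uc);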

> >
> >	GEM_BUG_ON(!intel_uc_supports_guc(uc));
> >	GEM_BUG_ON(!intel_uc_wants_guc(uc));
> > @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> >	else
> >		attempts = 1;
> >
> > +	/* Disable a potentially low PL1 power limit to allow freq to be raised */
> > +	i915_hwmon_power_max_disable(gt->i915, &pl1en);
> > +
> >	intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> >
> >	while (attempts--) {
> > @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> >		intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >	}
> >
> > +	i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> >	guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
> >	guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
> >
> > @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> >	/* Return GT back to RPn */
> >	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >
> > +	i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> >	__uc_sanitize(uc);
> >
> >	if (!ret) {
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 7f44e809ca155..9ab8971679fe3 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -50,6 +50,7 @@ struct hwm_drvdata {
> >	struct hwm_energy_info ei;		/*  Energy info for energy1_input */
> >	char name[12];
> >	int gt_n;
> > +	bool reset_in_progress;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -400,6 +401,10 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >	u32 nval;
> >
> >	mutex_lock(&hwmon->hwmon_lock);
> > +	if (hwmon->ddat.reset_in_progress) {
> > +		ret = -EAGAIN;
> > +		goto unlock;
> > +	}
> >	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> >	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> > @@ -421,6 +426,7 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> >  exit:
> >	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> > +unlock:
> >	mutex_unlock(&hwmon->hwmon_lock);
> >	return ret;
> >  }
> > @@ -472,6 +478,40 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int chan, long val)
> >	}
> >  }
> >
> > +void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old)
> > +{
> > +	struct i915_hwmon *hwmon = i915->hwmon;
> > +	u32 r;
> > +
> > +	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > +		return;
> > +
> > +	mutex_lock(&hwmon->hwmon_lock);
> > +
> > +	hwmon->ddat.reset_in_progress = true;
> > +	r = intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > +			     PKG_PWR_LIM_1_EN, 0);
> > +	*old = !!(r & PKG_PWR_LIM_1_EN);
> > +
> > +	mutex_unlock(&hwmon->hwmon_lock);
> > +}
> > +
> > +void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> > +{
> > +	struct i915_hwmon *hwmon = i915->hwmon;
> > +
> > +	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > +		return;
> > +
> > +	mutex_lock(&hwmon->hwmon_lock);
> > +
> > +	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > +			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > +	hwmon->ddat.reset_in_progress = false;
> > +
> > +	mutex_unlock(&hwmon->hwmon_lock);
> > +}
>
> you could have combined both functions in a
> i915_hwmon_power_max_set(struct drm_i915_private *i915, bool val, bool *old)
>
> then pass NULL to old on the restoration times
> and have
>     if (old)
>		*old = !!(r & PKG_PWR_LIM_1_EN);
>
> But really up to you here, the current code is clear to follow imho
> so, with the pl1en initialization fixed:

Yes, left this as is.

>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Have retained the R-b since the fix in __uc_init_hw is minor.

Thanks!
Ashutosh

> > +
> >  static umode_t
> >  hwm_energy_is_visible(const struct hwm_drvdata *ddat, u32 attr)
> >  {
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.h b/drivers/gpu/drm/i915/i915_hwmon.h
> > index 7ca9cf2c34c96..0fcb7de844061 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.h
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.h
> > @@ -7,14 +7,21 @@
> >  #ifndef __I915_HWMON_H__
> >  #define __I915_HWMON_H__
> >
> > +#include <linux/types.h>
> > +
> >  struct drm_i915_private;
> > +struct intel_gt;
> >
> >  #if IS_REACHABLE(CONFIG_HWMON)
> >  void i915_hwmon_register(struct drm_i915_private *i915);
> >  void i915_hwmon_unregister(struct drm_i915_private *i915);
> > +void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old);
> > +void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old);
> >  #else
> >  static inline void i915_hwmon_register(struct drm_i915_private *i915) { };
> >  static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { };
> > +static inline void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old) { };
> > +static inline void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) { };
> >  #endif
> >
> >  #endif /* __I915_HWMON_H__ */
> > --
> > 2.38.0
> >

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-07 11:04   ` Rodrigo Vivi
@ 2023-04-10 22:40     ` Dixit, Ashutosh
  0 siblings, 0 replies; 25+ messages in thread
From: Dixit, Ashutosh @ 2023-04-10 22:40 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel

On Fri, 07 Apr 2023 04:04:06 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Wed, Apr 05, 2023 at 09:45:22PM -0700, Ashutosh Dixit wrote:
> > Instead of erroring out when GuC reset is in progress, block waiting for
> > GuC reset to complete which is a more reasonable uapi behavior.
> >
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_hwmon.c | 13 ++++++++++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9ab8971679fe3..4343efb48e61b 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> >	char name[12];
> >	int gt_n;
> >	bool reset_in_progress;
> > +	wait_queue_head_t wqh;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -400,10 +401,15 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >	int ret = 0;
> >	u32 nval;
> >
> > +retry:
> >	mutex_lock(&hwmon->hwmon_lock);
> >	if (hwmon->ddat.reset_in_progress) {
> > -		ret = -EAGAIN;
> > -		goto unlock;
> > +		mutex_unlock(&hwmon->hwmon_lock);
> > +		ret = wait_event_interruptible(ddat->wqh,
> > +					       !hwmon->ddat.reset_in_progress);
>
> this is indeed very clever!

Not clever, see below :/

> my fear is probably due to the lack of knowledge on this wait queue, but
> I'm wondering what could go wrong if due to some funny race you enter this
> check right after wake_up_all below has passed and then you will be here
> indefinitely waiting...

You are absolutely right, there is indeed a race in the patch because in
the above code when we drop the mutex (mutex_unlock) the wake_up_all can
happen before we have queued ourselves for the wake up.

Solving this race needs a more complicated prepare_to_wait/finish_wait
sequence which I have gone ahead and implemented in patch v2. The v2 code
is also a standard code pattern and the pattern I have implemented is
basically the same as that in intel_guc_wait_for_pending_msg() in i915
which I liked.

I have read in several places (e.g. in the Advanced Sleeping section in
https://static.lwn.net/images/pdf/LDD3/ch06.pdf and in kernel documentation
for try_to_wake_up()) that this sequence will avoid the race (between
schedule() and wake_up()). The crucial difference from the v1 patch is that
in v2 the mutex is dropped after we queue ourselves in prepare_to_wait()
just before calling schedule_timeout().
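
In outline the v2 sequence is the following (annotated, with the signal and
timeout checks omitted for brevity):

	for (;;) {
		mutex_lock(&hwmon->hwmon_lock);

		/* queue ourselves on the waitqueue while holding the lock */
		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);

		/* check the condition under the same lock */
		if (!hwmon->ddat.reset_in_progress)
			break;

		/*
		 * Only now drop the lock: a wake_up_all() racing with us from
		 * here on finds us already queued, so schedule_timeout() just
		 * returns instead of sleeping indefinitely.
		 */
		mutex_unlock(&hwmon->hwmon_lock);

		timeout = schedule_timeout(timeout);
	}
	finish_wait(&ddat->waitq, &wait);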

> maybe just use the timeout version to be on the safe side and then return
> -EAGAIN on timeout?

Also incorporated timeout in the new version. All code paths in the new
patch have been tested.

Thanks.
--
Ashutosh

> > +		if (ret)
> > +			return ret;
> > +		goto retry;
> >	}
> >	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> > @@ -426,7 +432,6 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> >  exit:
> >	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> > -unlock:
> >	mutex_unlock(&hwmon->hwmon_lock);
> >	return ret;
> >  }
> > @@ -508,6 +513,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> >	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> >			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> >	hwmon->ddat.reset_in_progress = false;
> > +	wake_up_all(&hwmon->ddat.wqh);
> >
> >	mutex_unlock(&hwmon->hwmon_lock);
> >  }
> > @@ -784,6 +790,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> >	ddat->uncore = &i915->uncore;
> >	snprintf(ddat->name, sizeof(ddat->name), "i915");
> >	ddat->gt_n = -1;
> > +	init_waitqueue_head(&ddat->wqh);
> >
> >	for_each_gt(gt, i915, i) {
> >		ddat_gt = hwmon->ddat_gt + i;
> > --
> > 2.38.0
> >

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
@ 2023-04-20 16:40 ` Ashutosh Dixit
  0 siblings, 0 replies; 25+ messages in thread
From: Ashutosh Dixit @ 2023-04-20 16:40 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

Instead of erroring out when GuC reset is in progress, block waiting for
GuC reset to complete which is a more reasonable uapi behavior.

v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
v3: Remove timeout when blocked (Tvrtko)

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_hwmon.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 9ab8971679fe3..a3bdd9f68a458 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -51,6 +51,7 @@ struct hwm_drvdata {
 	char name[12];
 	int gt_n;
 	bool reset_in_progress;
+	wait_queue_head_t waitq;
 };
 
 struct i915_hwmon {
@@ -397,14 +398,32 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 {
 	struct i915_hwmon *hwmon = ddat->hwmon;
 	intel_wakeref_t wakeref;
+	DEFINE_WAIT(wait);
 	int ret = 0;
 	u32 nval;
 
-	mutex_lock(&hwmon->hwmon_lock);
-	if (hwmon->ddat.reset_in_progress) {
-		ret = -EAGAIN;
-		goto unlock;
+	/* Block waiting for GuC reset to complete when needed */
+	for (;;) {
+		mutex_lock(&hwmon->hwmon_lock);
+
+		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
+
+		if (!hwmon->ddat.reset_in_progress)
+			break;
+
+		if (signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
+
+		mutex_unlock(&hwmon->hwmon_lock);
+
+		schedule();
 	}
+	finish_wait(&ddat->waitq, &wait);
+	if (ret)
+		goto unlock;
+
 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
 	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
@@ -508,6 +527,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
 	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
 			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
 	hwmon->ddat.reset_in_progress = false;
+	wake_up_all(&hwmon->ddat.waitq);
 
 	mutex_unlock(&hwmon->hwmon_lock);
 }
@@ -784,6 +804,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
 	ddat->uncore = &i915->uncore;
 	snprintf(ddat->name, sizeof(ddat->name), "i915");
 	ddat->gt_n = -1;
+	init_waitqueue_head(&ddat->waitq);
 
 	for_each_gt(gt, i915, i) {
 		ddat_gt = hwmon->ddat_gt + i;
-- 
2.38.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-20 15:43         ` Rodrigo Vivi
@ 2023-04-20 16:26           ` Dixit, Ashutosh
  0 siblings, 0 replies; 25+ messages in thread
From: Dixit, Ashutosh @ 2023-04-20 16:26 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel

On Thu, 20 Apr 2023 08:43:52 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Thu, Apr 20, 2023 at 08:57:24AM +0100, Tvrtko Ursulin wrote:
> >
> > On 19/04/2023 23:10, Dixit, Ashutosh wrote:
> > > On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote:
> > > >
> > >
> > > Hi Tvrtko,
> > >
> > > > On 10/04/2023 23:35, Ashutosh Dixit wrote:
> > > > > Instead of erroring out when GuC reset is in progress, block waiting for
> > > > > GuC reset to complete which is a more reasonable uapi behavior.
> > > > >
> > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > > > >
> > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > > > ---
> > > > >    drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
> > > > >    1 file changed, 33 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > index 9ab8971679fe3..8471a667dfc71 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > > > >	char name[12];
> > > > >	int gt_n;
> > > > >	bool reset_in_progress;
> > > > > +	wait_queue_head_t waitq;
> > > > >    };
> > > > >      struct i915_hwmon {
> > > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
> > > > >    static int
> > > > >    hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > > > >    {
> > > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > > > +
> > > > > +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > > >
> > > > Patch looks good to me
> > >
> > > Great, thanks :)
> > >
> > > > apart that I am not sure what is the purpose of the timeout? This is just
> > > > the sysfs write path or has more callers?
> > >
> > > It is just the sysfs path, but the sysfs is accessed also by the oneAPI
> > > stack (Level 0). In the initial version I also didn't have the timeout
> > > thinking that the app can send a signal to the blocked thread to unblock
> > > it. I introduced the timeout after Rodrigo brought it up and I am now
> > > thinking maybe it's better to have the timeout in the driver since the app
> > > has no knowledge of how long GuC resets can take. But I can remove it if
> > > you think it's not needed.
> >
> > Maybe I am missing something but I don't get why we would need to provide a
> > timeout facility in sysfs? If the library writes here to configure something
> > it already has to expect a blocking write by the nature of a write(2) and
> > sysfs contract. It can take long for any reason so I hope we are not
> > guaranteeing some latency number to someone? Or the concern is just about
> > things getting stuck? In which case I think Ctrl-C is the answer because
> > ETIME is not even listed as an errno for write(2).

Hmm, good point.

> I suggested the timeout on the other version because of that race,
> which is fixed now with this approach. It is probably better to remove
> it now to avoid confusions. I'm sorry about that.

No problem, I've removed the timeout in the latest version.

Thanks for the R-b.

Ashutosh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-19 22:13         ` Dixit, Ashutosh
@ 2023-04-20 15:45           ` Rodrigo Vivi
  0 siblings, 0 replies; 25+ messages in thread
From: Rodrigo Vivi @ 2023-04-20 15:45 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel

On Wed, Apr 19, 2023 at 03:13:08PM -0700, Dixit, Ashutosh wrote:
> On Wed, 19 Apr 2023 12:40:44 -0700, Rodrigo Vivi wrote:
> >
> 
> Hi Rodrigo,
> 
> > On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote:
> > > On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote:
> > > >
> > >
> > > Hi Rodrigo,
> > >
> > > > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> > > > > Instead of erroring out when GuC reset is in progress, block waiting for
> > > > > GuC reset to complete which is a more reasonable uapi behavior.
> > > > >
> > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > > > >
> > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
> > > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > index 9ab8971679fe3..8471a667dfc71 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > > > >	char name[12];
> > > > >	int gt_n;
> > > > >	bool reset_in_progress;
> > > > > +	wait_queue_head_t waitq;
> > > > >  };
> > > > >
> > > > >  struct i915_hwmon {
> > > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
> > > > >  static int
> > > > >  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > > > >  {
> > > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > > > +
> > > > > +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > > > >	struct i915_hwmon *hwmon = ddat->hwmon;
> > > > >	intel_wakeref_t wakeref;
> > > > > -	int ret = 0;
> > > > > +	DEFINE_WAIT(wait);
> > > > >	u32 nval;
> > > > >
> > > > > -	mutex_lock(&hwmon->hwmon_lock);
> > > > > -	if (hwmon->ddat.reset_in_progress) {
> > > > > -		ret = -EAGAIN;
> > > > > -		goto unlock;
> > > > > +	/* Block waiting for GuC reset to complete when needed */
> > > > > +	for (;;) {
> > > > > +		mutex_lock(&hwmon->hwmon_lock);
> > > >
> > > > I'm really afraid of how this mutex is handled with the wait queue.
> > > > some initial thought it looks like it is trying to reimplement ww_mutex?
> > >
> > > Sorry, but I am missing the relation with ww_mutex. No such relation is
> > > intended.
> > >
> > > > all other examples of the wait_queue usages like this or didn't use
> > > > locks or had it in a total different flow that I could not correlate.
> > >
> > > Actually there are several examples of prepare_to_wait/finish_wait
> > > sequences with both spinlock and mutex in the kernel. See
> > > e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read().
> > >
> > > Also, as I mentioned, except for the lock, the sequence here is identical
> > > to intel_guc_wait_for_pending_msg().
> > >
> > > >
> > > > > +
> > > > > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > > > +
> > > > > +		if (!hwmon->ddat.reset_in_progress)
> > > > > +			break;
> > > >
> > > > If this breaks we never unlock it?
> > >
> > > Correct, this is the original case in Patch 2 where the mutex is acquired
> > > in the beginning of the function and released just before the final exit
> > > from the function (so the mutex is held for the entire duration of the
> > > function).
> >
> > I got really confused here...
> 
> Sorry, the patch is a little confusing/tricky but I thought I'd better
> stick to the standard 'for (;;)' loop pattern otherwise it will also be
> hard to review.
> 
> > I looked at the patch 2 again and I don't see any place where the lock
> > remains outside of the function. What was what I asked to remove on the
> > initial versions.
> 
> So it was in Patch 1 where we changed the code to take the lock in the
> beginning of the function and release it at the end of the function (you
> can see it in Patch 1).
> 
> In Patch 2 the 'unlock' label and 'goto unlock' is introduced and the lock
> is released at the 'unlock' label (it is visible in Patch 2).
> 
> > But now with this one I'm even more confused because I couldn't follow
> > to understand who will remove the lock and when.
> 
> In Patch 3 again the lock is released at the the 'unlock' label (i.e. the
> destination of 'goto unlock', not visible in Patch 3). But we execute 'goto
> unlock' only when 'ret != 0' in the 'for (;;)' loop. But when 'ret == 0'
> (when 'ddat.reset_in_progress' flag is clear) we hold the mutex, execute
> the entire function and finally release the lock at the end of the
> function.
> 
> Hopefully this helps.

more coffee also helped! I'm sorry for the noise.

with the timeout thing sorted out:

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


> 
> Thanks.
> --
> Ashutosh
> 
> >
> > >
> > > >
> > > > > +
> > > > > +		if (signal_pending(current)) {
> > > > > +			ret = -EINTR;
> > > > > +			break;
> > > > > +		}
> > > > > +
> > > > > +		if (!timeout) {
> > > > > +			ret = -ETIME;
> > > > > +			break;
> > > > > +		}
> > > > > +
> > > > > +		mutex_unlock(&hwmon->hwmon_lock);
> > > >
> > > > do we need to lock the signal pending and timeout as well?
> > > > or only wrapping it around the hwmon->ddat access would be
> > > > enough?
> > >
> > > Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress
> > > flag. But because this is not a performance path, implementing it as done
> > > in the patch simplifies the code flow (since there are several if/else,
> > > goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider).
> > >
> > > So if possible I *really* want to not try to over-optimize here (I did try
> > > a few other things when writing the patch but it was getting ugly). The
> > > only real requirement is to drop the lock before calling schedule_timeout()
> > > below (and we are reacquiring the lock as soon as we are scheduled back in,
> > > as you can see in the loop above).
> > >
> > > >
> > > > > +
> > > > > +		timeout = schedule_timeout(timeout);
> > > > >	}
> > > > > +	finish_wait(&ddat->waitq, &wait);
> > > > > +	if (ret)
> > > > > +		goto unlock;
> > > > > +
> > > > >	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> > > > >
> > > > >	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> > > > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> > > > >	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > > > >			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > > > >	hwmon->ddat.reset_in_progress = false;
> > > > > +	wake_up_all(&hwmon->ddat.waitq);
> > > > >
> > > > >	mutex_unlock(&hwmon->hwmon_lock);
> > > > >  }
> > > > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > > > >	ddat->uncore = &i915->uncore;
> > > > >	snprintf(ddat->name, sizeof(ddat->name), "i915");
> > > > >	ddat->gt_n = -1;
> > > > > +	init_waitqueue_head(&ddat->waitq);
> > > > >
> > > > >	for_each_gt(gt, i915, i) {
> > > > >		ddat_gt = hwmon->ddat_gt + i;
> > > > > --
> > > > > 2.38.0
> > > > >
> > >
> > > From what I understand, the locking above is fine and is not the
> > > point. The real race is between schedule_timeout() (which suspends the
> > > thread) and wake_up_all() (which schedules it back in). But this
> > > prepare_to_wait/finish_wait pattern is so widespread that the kernel
> > > guarantees that this works correctly as long as you do things in the
> > > correct order (otherwise we'd see a lot more kernel hangs/deadlocks).
> > >
> > > Thanks,
> > > Ashutosh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-20  7:57       ` Tvrtko Ursulin
@ 2023-04-20 15:43         ` Rodrigo Vivi
  2023-04-20 16:26           ` Dixit, Ashutosh
  0 siblings, 1 reply; 25+ messages in thread
From: Rodrigo Vivi @ 2023-04-20 15:43 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel

On Thu, Apr 20, 2023 at 08:57:24AM +0100, Tvrtko Ursulin wrote:
> 
> On 19/04/2023 23:10, Dixit, Ashutosh wrote:
> > On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote:
> > > 
> > 
> > Hi Tvrtko,
> > 
> > > On 10/04/2023 23:35, Ashutosh Dixit wrote:
> > > > Instead of erroring out when GuC reset is in progress, block waiting for
> > > > GuC reset to complete which is a more reasonable uapi behavior.
> > > > 
> > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > > > 
> > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > > ---
> > > >    drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
> > > >    1 file changed, 33 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > index 9ab8971679fe3..8471a667dfc71 100644
> > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > > > 	char name[12];
> > > > 	int gt_n;
> > > > 	bool reset_in_progress;
> > > > +	wait_queue_head_t waitq;
> > > >    };
> > > >      struct i915_hwmon {
> > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
> > > >    static int
> > > >    hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > > >    {
> > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > > +
> > > > +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > > 
> > > Patch looks good to me
> > 
> > Great, thanks :)
> > 
> > > apart that I am not sure what is the purpose of the timeout? This is just
> > > the sysfs write path or has more callers?
> > 
> > It is just the sysfs path, but the sysfs is accessed also by the oneAPI
> > stack (Level 0). In the initial version I also didn't have the timeout
> > thinking that the app can send a signal to the blocked thread to unblock
> > it. I introduced the timeout after Rodrigo brought it up and I am now
> > thinking maybe it's better to have the timeout in the driver since the app
> > has no knowledge of how long GuC resets can take. But I can remove it if
> > you think it's not needed.
> 
> Maybe I am missing something but I don't get why we would need to provide a
> timeout facility in sysfs? If the library writes here to configure something
> it already has to expect a blocking write by the nature of a write(2) and
> sysfs contract. It can take a long time for any reason so I hope we are not
> guaranteeing some latency number to someone? Or the concern is just about
> things getting stuck? In which case I think Ctrl-C is the answer because
> ETIME is not even listed as an errno for write(2).

I suggested the timeout on the other version because of that race,
which is fixed now with this approach. It is probably better to remove
it now to avoid confusion. I'm sorry about that.

> 
> > > If the
> > > former, perhaps it would be better to just make everything interruptible
> > > (mutex and sleep) and wait for as long as it takes, or until the user
> > > presses Ctrl-C?
> > 
> > Now we are not holding the mutexes for long, just long enough to do register
> > rmw's. So we are not holding the mutex across GuC reset as we were
> > originally. Therefore I am thinking mutex_lock_interruptible is not needed?
> > The sleep is already interruptible (TASK_INTERRUPTIBLE).
> 
> Ah yes, you are right.
> 
> Regards,
> 
> Tvrtko
> 
> > Anyway please let me know if you think we need to change anything.
> > 
> > Thanks.
> > --
> > Ashutosh
> > 
> > > > 	struct i915_hwmon *hwmon = ddat->hwmon;
> > > > 	intel_wakeref_t wakeref;
> > > > -	int ret = 0;
> > > > +	DEFINE_WAIT(wait);
> > > > 	u32 nval;
> > > >    -	mutex_lock(&hwmon->hwmon_lock);
> > > > -	if (hwmon->ddat.reset_in_progress) {
> > > > -		ret = -EAGAIN;
> > > > -		goto unlock;
> > > > +	/* Block waiting for GuC reset to complete when needed */
> > > > +	for (;;) {
> > > > +		mutex_lock(&hwmon->hwmon_lock);
> > > > +
> > > > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > > +
> > > > +		if (!hwmon->ddat.reset_in_progress)
> > > > +			break;
> > > > +
> > > > +		if (signal_pending(current)) {
> > > > +			ret = -EINTR;
> > > > +			break;
> > > > +		}
> > > > +
> > > > +		if (!timeout) {
> > > > +			ret = -ETIME;
> > > > +			break;
> > > > +		}
> > > > +
> > > > +		mutex_unlock(&hwmon->hwmon_lock);
> > > > +
> > > > +		timeout = schedule_timeout(timeout);
> > > > 	}
> > > > +	finish_wait(&ddat->waitq, &wait);
> > > > +	if (ret)
> > > > +		goto unlock;
> > > > +
> > > > 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> > > > 		/* Disable PL1 limit and verify, because the limit cannot be
> > > > disabled on all platforms */
> > > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> > > > 	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > > > 			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > > > 	hwmon->ddat.reset_in_progress = false;
> > > > +	wake_up_all(&hwmon->ddat.waitq);
> > > > 		mutex_unlock(&hwmon->hwmon_lock);
> > > >    }
> > > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > > > 	ddat->uncore = &i915->uncore;
> > > > 	snprintf(ddat->name, sizeof(ddat->name), "i915");
> > > > 	ddat->gt_n = -1;
> > > > +	init_waitqueue_head(&ddat->waitq);
> > > > 		for_each_gt(gt, i915, i) {
> > > > 		ddat_gt = hwmon->ddat_gt + i;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-19 22:10     ` Dixit, Ashutosh
@ 2023-04-20  7:57       ` Tvrtko Ursulin
  2023-04-20 15:43         ` Rodrigo Vivi
  0 siblings, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2023-04-20  7:57 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel, Rodrigo Vivi


On 19/04/2023 23:10, Dixit, Ashutosh wrote:
> On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote:
>>
> 
> Hi Tvrtko,
> 
>> On 10/04/2023 23:35, Ashutosh Dixit wrote:
>>> Instead of erroring out when GuC reset is in progress, block waiting for
>>> GuC reset to complete which is a more reasonable uapi behavior.
>>>
>>> v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
>>>
>>> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
>>>    1 file changed, 33 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
>>> index 9ab8971679fe3..8471a667dfc71 100644
>>> --- a/drivers/gpu/drm/i915/i915_hwmon.c
>>> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
>>> @@ -51,6 +51,7 @@ struct hwm_drvdata {
>>> 	char name[12];
>>> 	int gt_n;
>>> 	bool reset_in_progress;
>>> +	wait_queue_head_t waitq;
>>>    };
>>>      struct i915_hwmon {
>>> @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
>>>    static int
>>>    hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>>>    {
>>> +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
>>> +
>>> +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
>>
>> Patch looks good to me
> 
> Great, thanks :)
> 
>> apart from the fact that I am not sure what the purpose of the timeout is.
>> Is this just the sysfs write path or does it have more callers?
> 
> It is just the sysfs path, but the sysfs is accessed also by the oneAPI
> stack (Level 0). In the initial version I also didn't have the timeout
> thinking that the app can send a signal to the blocked thread to unblock
> it. I introduced the timeout after Rodrigo brought it up and I am now
> thinking maybe it's better to have the timeout in the driver since the app
> has no knowledge of how long GuC resets can take. But I can remove it if
> you think it's not needed.

Maybe I am missing something but I don't get why we would need to 
provide a timeout facility in sysfs? If the library writes here to 
configure something it already has to expect a blocking write by the 
nature of a write(2) and sysfs contract. It can take a long time for any
reason so I hope we are not guaranteeing some latency number to someone?
Or the concern is just about things getting stuck? In which case I think 
Ctrl-C is the answer because ETIME is not even listed as an errno for 
write(2).
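
For illustration, the userspace side of that contract is roughly the
sketch below (the hwmon path and the value written are made up, and
error handling is abbreviated):

  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
          /* Hypothetical hwmon node; the real index varies per system */
          const char *path = "/sys/class/hwmon/hwmon2/power1_max";
          const char *val = "25000000\n";    /* 25 W in microwatts */
          ssize_t ret;
          int fd;

          fd = open(path, O_WRONLY);
          if (fd < 0) {
                  perror("open");
                  return 1;
          }

          /*
           * The write may block while a GuC reset is in flight; a signal
           * such as Ctrl-C makes it return -1 with errno == EINTR.
           */
          ret = write(fd, val, strlen(val));
          if (ret < 0)
                  fprintf(stderr, "write: %s\n", strerror(errno));

          close(fd);
          return ret < 0;
  }

So from the application's point of view a signal, not a kernel-internal
timeout, is the natural way to bound the wait.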

>> If the
>> former, perhaps it would be better to just make everything interruptible
>> (mutex and sleep) and wait for as long as it takes, or until the user
>> presses Ctrl-C?
> 
> Now we are not holding the mutexes for long, just long enough to do register
> rmw's. So we are not holding the mutex across GuC reset as we were
> originally. Therefore I am thinking mutex_lock_interruptible is not needed?
> The sleep is already interruptible (TASK_INTERRUPTIBLE).

Ah yes, you are right.

Regards,

Tvrtko

> Anyway please let me know if you think we need to change anything.
> 
> Thanks.
> --
> Ashutosh
> 
>>> 	struct i915_hwmon *hwmon = ddat->hwmon;
>>> 	intel_wakeref_t wakeref;
>>> -	int ret = 0;
>>> +	DEFINE_WAIT(wait);
>>> 	u32 nval;
>>>    -	mutex_lock(&hwmon->hwmon_lock);
>>> -	if (hwmon->ddat.reset_in_progress) {
>>> -		ret = -EAGAIN;
>>> -		goto unlock;
>>> +	/* Block waiting for GuC reset to complete when needed */
>>> +	for (;;) {
>>> +		mutex_lock(&hwmon->hwmon_lock);
>>> +
>>> +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
>>> +
>>> +		if (!hwmon->ddat.reset_in_progress)
>>> +			break;
>>> +
>>> +		if (signal_pending(current)) {
>>> +			ret = -EINTR;
>>> +			break;
>>> +		}
>>> +
>>> +		if (!timeout) {
>>> +			ret = -ETIME;
>>> +			break;
>>> +		}
>>> +
>>> +		mutex_unlock(&hwmon->hwmon_lock);
>>> +
>>> +		timeout = schedule_timeout(timeout);
>>> 	}
>>> +	finish_wait(&ddat->waitq, &wait);
>>> +	if (ret)
>>> +		goto unlock;
>>> +
>>> 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
>>> 		/* Disable PL1 limit and verify, because the limit cannot be
>>> disabled on all platforms */
>>> @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
>>> 	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
>>> 			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
>>> 	hwmon->ddat.reset_in_progress = false;
>>> +	wake_up_all(&hwmon->ddat.waitq);
>>> 		mutex_unlock(&hwmon->hwmon_lock);
>>>    }
>>> @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
>>> 	ddat->uncore = &i915->uncore;
>>> 	snprintf(ddat->name, sizeof(ddat->name), "i915");
>>> 	ddat->gt_n = -1;
>>> +	init_waitqueue_head(&ddat->waitq);
>>> 		for_each_gt(gt, i915, i) {
>>> 		ddat_gt = hwmon->ddat_gt + i;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-19 19:40       ` Rodrigo Vivi
@ 2023-04-19 22:13         ` Dixit, Ashutosh
  2023-04-20 15:45           ` Rodrigo Vivi
  0 siblings, 1 reply; 25+ messages in thread
From: Dixit, Ashutosh @ 2023-04-19 22:13 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel

On Wed, 19 Apr 2023 12:40:44 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote:
> > On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote:
> > >
> >
> > Hi Rodrigo,
> >
> > > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> > > > Instead of erroring out when GuC reset is in progress, block waiting for
> > > > GuC reset to complete which is a more reasonable uapi behavior.
> > > >
> > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > > >
> > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
> > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > index 9ab8971679fe3..8471a667dfc71 100644
> > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > > >	char name[12];
> > > >	int gt_n;
> > > >	bool reset_in_progress;
> > > > +	wait_queue_head_t waitq;
> > > >  };
> > > >
> > > >  struct i915_hwmon {
> > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
> > > >  static int
> > > >  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > > >  {
> > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > > +
> > > > +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > > >	struct i915_hwmon *hwmon = ddat->hwmon;
> > > >	intel_wakeref_t wakeref;
> > > > -	int ret = 0;
> > > > +	DEFINE_WAIT(wait);
> > > >	u32 nval;
> > > >
> > > > -	mutex_lock(&hwmon->hwmon_lock);
> > > > -	if (hwmon->ddat.reset_in_progress) {
> > > > -		ret = -EAGAIN;
> > > > -		goto unlock;
> > > > +	/* Block waiting for GuC reset to complete when needed */
> > > > +	for (;;) {
> > > > +		mutex_lock(&hwmon->hwmon_lock);
> > >
> > > I'm really afraid of how this mutex is handled with the wait queue.
> > > At first glance it looks like it is trying to reimplement ww_mutex?
> >
> > Sorry, but I am missing the relation with ww_mutex. No such relation is
> > intended.
> >
> > > all the other examples of wait_queue usage like this either didn't use
> > > locks or had them in a totally different flow that I could not correlate.
> >
> > Actually there are several examples of prepare_to_wait/finish_wait
> > sequences with both spinlock and mutex in the kernel. See
> > e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read().
> >
> > Also, as I mentioned, except for the lock, the sequence here is identical
> > to intel_guc_wait_for_pending_msg().
> >
> > >
> > > > +
> > > > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > > +
> > > > +		if (!hwmon->ddat.reset_in_progress)
> > > > +			break;
> > >
> > > If this breaks we never unlock it?
> >
> > Correct, this is the original case in Patch 2 where the mutex is acquired
> > in the beginning of the function and released just before the final exit
> > from the function (so the mutex is held for the entire duration of the
> > function).
>
> I got really confused here...

Sorry, the patch is a little confusing/tricky, but I thought I'd better
stick to the standard 'for (;;)' loop pattern, otherwise it would also be
hard to review.

> I looked at patch 2 again and I don't see any place where the lock
> remains held outside of the function, which is what I asked to remove in
> the initial versions.

So it was in Patch 1 where we changed the code to take the lock at the
beginning of the function and release it at the end of the function (you
can see this in Patch 1).

In Patch 2 the 'unlock' label and 'goto unlock' are introduced and the
lock is released at the 'unlock' label (this is visible in Patch 2).

> But now with this one I'm even more confused, because I couldn't follow
> who will release the lock and when.

In Patch 3 the lock is again released at the 'unlock' label (i.e. the
destination of 'goto unlock', not visible in Patch 3), but we execute
'goto unlock' only when 'ret != 0' in the 'for (;;)' loop. When 'ret == 0'
(i.e. when the 'ddat.reset_in_progress' flag is clear) we hold the mutex,
execute the entire function and finally release the lock at the end of
the function.
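
Spelled out, a condensed sketch of the lock lifetime in
hwm_power_max_write() after Patch 3 (the same code as in the patch,
abbreviated, with comments added to mark where the mutex is held):

        for (;;) {
                mutex_lock(&hwmon->hwmon_lock);
                prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);

                if (!hwmon->ddat.reset_in_progress)
                        break;          /* exit the loop with the mutex held */

                if (signal_pending(current)) {
                        ret = -EINTR;
                        break;          /* also exits with the mutex held */
                }

                if (!timeout) {
                        ret = -ETIME;
                        break;          /* ditto */
                }

                mutex_unlock(&hwmon->hwmon_lock);   /* dropped only around the sleep */
                timeout = schedule_timeout(timeout);
        }
        finish_wait(&ddat->waitq, &wait);
        if (ret)
                goto unlock;            /* error: jump to the final unlock */

        /* ret == 0: mutex still held, do the register rmw's here ... */

unlock:
        mutex_unlock(&hwmon->hwmon_lock);   /* the single release point */
        return ret;

So every path out of the loop leaves the mutex held, and there is exactly
one mutex_unlock() at the end of the function.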

Hopefully this helps.

Thanks.
--
Ashutosh

>
> >
> > >
> > > > +
> > > > +		if (signal_pending(current)) {
> > > > +			ret = -EINTR;
> > > > +			break;
> > > > +		}
> > > > +
> > > > +		if (!timeout) {
> > > > +			ret = -ETIME;
> > > > +			break;
> > > > +		}
> > > > +
> > > > +		mutex_unlock(&hwmon->hwmon_lock);
> > >
> > > do we need to hold the lock around the signal-pending and timeout
> > > checks as well, or would wrapping it only around the hwmon->ddat
> > > access be enough?
> >
> > Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress
> > flag. But because this is not a performance path, implementing it as done
> > in the patch simplifies the code flow (since there are several if/else,
> > goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider).
> >
> > So if possible I *really* want to not try to over-optimize here (I did try
> > a few other things when writing the patch but it was getting ugly). The
> > only real requirement is to drop the lock before calling schedule_timeout()
> > below (and we are reacquiring the lock as soon as we are scheduled back in,
> > as you can see in the loop above).
> >
> > >
> > > > +
> > > > +		timeout = schedule_timeout(timeout);
> > > >	}
> > > > +	finish_wait(&ddat->waitq, &wait);
> > > > +	if (ret)
> > > > +		goto unlock;
> > > > +
> > > >	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> > > >
> > > >	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> > > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> > > >	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > > >			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > > >	hwmon->ddat.reset_in_progress = false;
> > > > +	wake_up_all(&hwmon->ddat.waitq);
> > > >
> > > >	mutex_unlock(&hwmon->hwmon_lock);
> > > >  }
> > > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > > >	ddat->uncore = &i915->uncore;
> > > >	snprintf(ddat->name, sizeof(ddat->name), "i915");
> > > >	ddat->gt_n = -1;
> > > > +	init_waitqueue_head(&ddat->waitq);
> > > >
> > > >	for_each_gt(gt, i915, i) {
> > > >		ddat_gt = hwmon->ddat_gt + i;
> > > > --
> > > > 2.38.0
> > > >
> >
> > From what I understand, the locking above is fine and is not the
> > point. The real race is between schedule_timeout() (which suspends the
> > thread) and wake_up_all() (which schedules it back in). But this
> > prepare_to_wait/finish_wait pattern is so widespread that the kernel
> > guarantees that this works correctly as long as you do things in the
> > correct order (otherwise we'd see a lot more kernel hangs/deadlocks).
> >
> > Thanks,
> > Ashutosh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-19 13:21   ` Tvrtko Ursulin
@ 2023-04-19 22:10     ` Dixit, Ashutosh
  2023-04-20  7:57       ` Tvrtko Ursulin
  0 siblings, 1 reply; 25+ messages in thread
From: Dixit, Ashutosh @ 2023-04-19 22:10 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 10/04/2023 23:35, Ashutosh Dixit wrote:
> > Instead of erroring out when GuC reset is in progress, block waiting for
> > GuC reset to complete which is a more reasonable uapi behavior.
> >
> > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> >
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
> >   1 file changed, 33 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9ab8971679fe3..8471a667dfc71 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> >	char name[12];
> >	int gt_n;
> >	bool reset_in_progress;
> > +	wait_queue_head_t waitq;
> >   };
> >     struct i915_hwmon {
> > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
> >   static int
> >   hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >   {
> > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > +
> > +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
>
> Patch looks good to me

Great, thanks :)

> apart from the fact that I am not sure what the purpose of the timeout is.
> Is this just the sysfs write path or does it have more callers?

It is just the sysfs path, but the sysfs is accessed also by the oneAPI
stack (Level 0). In the initial version I also didn't have the timeout
thinking that the app can send a signal to the blocked thread to unblock
it. I introduced the timeout after Rodrigo brought it up and I am now
thinking maybe it's better to have the timeout in the driver since the app
has no knowledge of how long GuC resets can take. But I can remove it if
you think it's not needed.

> If the
> former, perhaps it would be better to just make everything interruptible
> (mutex and sleep) and wait for as long as it takes, or until the user
> presses Ctrl-C?

Now we are not holding the mutexes for long, just long enough to do register
rmw's. So we are not holding the mutex across GuC reset as we were
originally. Therefore I am thinking mutex_lock_interruptible is not needed?
The sleep is already interruptible (TASK_INTERRUPTIBLE).

Anyway please let me know if you think we need to change anything.

Thanks.
--
Ashutosh

> >	struct i915_hwmon *hwmon = ddat->hwmon;
> >	intel_wakeref_t wakeref;
> > -	int ret = 0;
> > +	DEFINE_WAIT(wait);
> >	u32 nval;
> >   -	mutex_lock(&hwmon->hwmon_lock);
> > -	if (hwmon->ddat.reset_in_progress) {
> > -		ret = -EAGAIN;
> > -		goto unlock;
> > +	/* Block waiting for GuC reset to complete when needed */
> > +	for (;;) {
> > +		mutex_lock(&hwmon->hwmon_lock);
> > +
> > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > +
> > +		if (!hwmon->ddat.reset_in_progress)
> > +			break;
> > +
> > +		if (signal_pending(current)) {
> > +			ret = -EINTR;
> > +			break;
> > +		}
> > +
> > +		if (!timeout) {
> > +			ret = -ETIME;
> > +			break;
> > +		}
> > +
> > +		mutex_unlock(&hwmon->hwmon_lock);
> > +
> > +		timeout = schedule_timeout(timeout);
> >	}
> > +	finish_wait(&ddat->waitq, &wait);
> > +	if (ret)
> > +		goto unlock;
> > +
> >	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >		/* Disable PL1 limit and verify, because the limit cannot be
> > disabled on all platforms */
> > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> >	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> >			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> >	hwmon->ddat.reset_in_progress = false;
> > +	wake_up_all(&hwmon->ddat.waitq);
> >		mutex_unlock(&hwmon->hwmon_lock);
> >   }
> > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> >	ddat->uncore = &i915->uncore;
> >	snprintf(ddat->name, sizeof(ddat->name), "i915");
> >	ddat->gt_n = -1;
> > +	init_waitqueue_head(&ddat->waitq);
> >		for_each_gt(gt, i915, i) {
> >		ddat_gt = hwmon->ddat_gt + i;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-18 17:23     ` Dixit, Ashutosh
@ 2023-04-19 19:40       ` Rodrigo Vivi
  2023-04-19 22:13         ` Dixit, Ashutosh
  0 siblings, 1 reply; 25+ messages in thread
From: Rodrigo Vivi @ 2023-04-19 19:40 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel

On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote:
> On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote:
> >
> 
> Hi Rodrigo,
> 
> > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> > > Instead of erroring out when GuC reset is in progress, block waiting for
> > > GuC reset to complete which is a more reasonable uapi behavior.
> > >
> > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > >
> > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
> > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > > index 9ab8971679fe3..8471a667dfc71 100644
> > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > >	char name[12];
> > >	int gt_n;
> > >	bool reset_in_progress;
> > > +	wait_queue_head_t waitq;
> > >  };
> > >
> > >  struct i915_hwmon {
> > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
> > >  static int
> > >  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > >  {
> > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > +
> > > +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > >	struct i915_hwmon *hwmon = ddat->hwmon;
> > >	intel_wakeref_t wakeref;
> > > -	int ret = 0;
> > > +	DEFINE_WAIT(wait);
> > >	u32 nval;
> > >
> > > -	mutex_lock(&hwmon->hwmon_lock);
> > > -	if (hwmon->ddat.reset_in_progress) {
> > > -		ret = -EAGAIN;
> > > -		goto unlock;
> > > +	/* Block waiting for GuC reset to complete when needed */
> > > +	for (;;) {
> > > +		mutex_lock(&hwmon->hwmon_lock);
> >
> > I'm really afraid of how this mutex is handled with the wait queue.
> > At first glance it looks like it is trying to reimplement ww_mutex?
> 
> Sorry, but I am missing the relation with ww_mutex. No such relation is
> intended.
> 
> > all the other examples of wait_queue usage like this either didn't use
> > locks or had them in a totally different flow that I could not correlate.
> 
> Actually there are several examples of prepare_to_wait/finish_wait
> sequences with both spinlock and mutex in the kernel. See
> e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read().
> 
> Also, as I mentioned, except for the lock, the sequence here is identical
> to intel_guc_wait_for_pending_msg().
> 
> >
> > > +
> > > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > +
> > > +		if (!hwmon->ddat.reset_in_progress)
> > > +			break;
> >
> > If this breaks we never unlock it?
> 
> Correct, this is the original case in Patch 2 where the mutex is acquired
> in the beginning of the function and released just before the final exit
> from the function (so the mutex is held for the entire duration of the
> function).

I got really confused here... I looked at patch 2 again and I don't see
any place where the lock remains held outside of the function, which is
what I asked to remove in the initial versions.

But now with this one I'm even more confused, because I couldn't follow
who will release the lock and when.

> 
> >
> > > +
> > > +		if (signal_pending(current)) {
> > > +			ret = -EINTR;
> > > +			break;
> > > +		}
> > > +
> > > +		if (!timeout) {
> > > +			ret = -ETIME;
> > > +			break;
> > > +		}
> > > +
> > > +		mutex_unlock(&hwmon->hwmon_lock);
> >
> > do we need to hold the lock around the signal-pending and timeout
> > checks as well, or would wrapping it only around the hwmon->ddat
> > access be enough?
> 
> Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress
> flag. But because this is not a performance path, implementing it as done
> in the patch simplifies the code flow (since there are several if/else,
> goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider).
> 
> So if possible I *really* want to not try to over-optimize here (I did try
> a few other things when writing the patch but it was getting ugly). The
> only real requirement is to drop the lock before calling schedule_timeout()
> below (and we are reacquiring the lock as soon as we are scheduled back in,
> as you can see in the loop above).
> 
> >
> > > +
> > > +		timeout = schedule_timeout(timeout);
> > >	}
> > > +	finish_wait(&ddat->waitq, &wait);
> > > +	if (ret)
> > > +		goto unlock;
> > > +
> > >	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> > >
> > >	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> > >	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > >			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > >	hwmon->ddat.reset_in_progress = false;
> > > +	wake_up_all(&hwmon->ddat.waitq);
> > >
> > >	mutex_unlock(&hwmon->hwmon_lock);
> > >  }
> > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > >	ddat->uncore = &i915->uncore;
> > >	snprintf(ddat->name, sizeof(ddat->name), "i915");
> > >	ddat->gt_n = -1;
> > > +	init_waitqueue_head(&ddat->waitq);
> > >
> > >	for_each_gt(gt, i915, i) {
> > >		ddat_gt = hwmon->ddat_gt + i;
> > > --
> > > 2.38.0
> > >
> 
> From what I understand, the locking above is fine and is not the
> point. The real race is between schedule_timeout() (which suspends the
> thread) and wake_up_all() (which schedules it back in). But this
> prepare_to_wait/finish_wait pattern is so widespread that the kernel
> guarantees that this works correctly as long as you do things in the
> correct order (otherwise we'd see a lot more kernel hangs/deadlocks).
> 
> Thanks,
> Ashutosh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-10 22:35 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
  2023-04-18  5:35   ` Rodrigo Vivi
@ 2023-04-19 13:21   ` Tvrtko Ursulin
  2023-04-19 22:10     ` Dixit, Ashutosh
  1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2023-04-19 13:21 UTC (permalink / raw)
  To: Ashutosh Dixit, intel-gfx; +Cc: dri-devel, Rodrigo Vivi


On 10/04/2023 23:35, Ashutosh Dixit wrote:
> Instead of erroring out when GuC reset is in progress, block waiting for
> GuC reset to complete which is a more reasonable uapi behavior.
> 
> v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> 
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
>   1 file changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> index 9ab8971679fe3..8471a667dfc71 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.c
> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> @@ -51,6 +51,7 @@ struct hwm_drvdata {
>   	char name[12];
>   	int gt_n;
>   	bool reset_in_progress;
> +	wait_queue_head_t waitq;
>   };
>   
>   struct i915_hwmon {
> @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
>   static int
>   hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>   {
> +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> +
> +	int ret = 0, timeout = GUC_RESET_TIMEOUT;

Patch looks good to me, apart from the fact that I am not sure what the
purpose of the timeout is. Is this just the sysfs write path or does it
have more callers? If the former, perhaps it would be better to just make
everything interruptible (mutex and sleep) and wait for as long as it
takes, or until the user presses Ctrl-C?
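
A minimal sketch of that alternative (reusing the hwmon_lock/waitq names
from the patch; this is just the shape of the idea, not a tested
implementation):

        if (mutex_lock_interruptible(&hwmon->hwmon_lock))
                return -EINTR;

        while (hwmon->ddat.reset_in_progress) {
                mutex_unlock(&hwmon->hwmon_lock);
                /* No timeout: sleep until woken up or a signal arrives */
                if (wait_event_interruptible(ddat->waitq,
                                             !hwmon->ddat.reset_in_progress))
                        return -ERESTARTSYS;
                if (mutex_lock_interruptible(&hwmon->hwmon_lock))
                        return -EINTR;
        }

        /* reset not in progress and mutex held: do the register rmw's */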

Regards,

Tvrtko

>   	struct i915_hwmon *hwmon = ddat->hwmon;
>   	intel_wakeref_t wakeref;
> -	int ret = 0;
> +	DEFINE_WAIT(wait);
>   	u32 nval;
>   
> -	mutex_lock(&hwmon->hwmon_lock);
> -	if (hwmon->ddat.reset_in_progress) {
> -		ret = -EAGAIN;
> -		goto unlock;
> +	/* Block waiting for GuC reset to complete when needed */
> +	for (;;) {
> +		mutex_lock(&hwmon->hwmon_lock);
> +
> +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> +
> +		if (!hwmon->ddat.reset_in_progress)
> +			break;
> +
> +		if (signal_pending(current)) {
> +			ret = -EINTR;
> +			break;
> +		}
> +
> +		if (!timeout) {
> +			ret = -ETIME;
> +			break;
> +		}
> +
> +		mutex_unlock(&hwmon->hwmon_lock);
> +
> +		timeout = schedule_timeout(timeout);
>   	}
> +	finish_wait(&ddat->waitq, &wait);
> +	if (ret)
> +		goto unlock;
> +
>   	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
>   
>   	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
>   	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
>   			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
>   	hwmon->ddat.reset_in_progress = false;
> +	wake_up_all(&hwmon->ddat.waitq);
>   
>   	mutex_unlock(&hwmon->hwmon_lock);
>   }
> @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
>   	ddat->uncore = &i915->uncore;
>   	snprintf(ddat->name, sizeof(ddat->name), "i915");
>   	ddat->gt_n = -1;
> +	init_waitqueue_head(&ddat->waitq);
>   
>   	for_each_gt(gt, i915, i) {
>   		ddat_gt = hwmon->ddat_gt + i;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-18  5:35   ` Rodrigo Vivi
@ 2023-04-18 17:23     ` Dixit, Ashutosh
  2023-04-19 19:40       ` Rodrigo Vivi
  0 siblings, 1 reply; 25+ messages in thread
From: Dixit, Ashutosh @ 2023-04-18 17:23 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel

On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> > Instead of erroring out when GuC reset is in progress, block waiting for
> > GuC reset to complete which is a more reasonable uapi behavior.
> >
> > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> >
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
> >  1 file changed, 33 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9ab8971679fe3..8471a667dfc71 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> >	char name[12];
> >	int gt_n;
> >	bool reset_in_progress;
> > +	wait_queue_head_t waitq;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
> >  static int
> >  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >  {
> > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > +
> > +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
> >	struct i915_hwmon *hwmon = ddat->hwmon;
> >	intel_wakeref_t wakeref;
> > -	int ret = 0;
> > +	DEFINE_WAIT(wait);
> >	u32 nval;
> >
> > -	mutex_lock(&hwmon->hwmon_lock);
> > -	if (hwmon->ddat.reset_in_progress) {
> > -		ret = -EAGAIN;
> > -		goto unlock;
> > +	/* Block waiting for GuC reset to complete when needed */
> > +	for (;;) {
> > +		mutex_lock(&hwmon->hwmon_lock);
>
> I'm really afraid of how this mutex is handled with the wait queue.
> At first glance it looks like it is trying to reimplement ww_mutex?

Sorry, but I am missing the relation with ww_mutex. No such relation is
intended.

> all the other examples of wait_queue usage like this either didn't use
> locks or had them in a totally different flow that I could not correlate.

Actually there are several examples of prepare_to_wait/finish_wait
sequences with both spinlock and mutex in the kernel. See
e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read().

Also, as I mentioned, except for the lock, the sequence here is identical
to intel_guc_wait_for_pending_msg().

>
> > +
> > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > +
> > +		if (!hwmon->ddat.reset_in_progress)
> > +			break;
>
> If this breaks we never unlock it?

Correct, this is the original case in Patch 2 where the mutex is acquired
in the beginning of the function and released just before the final exit
from the function (so the mutex is held for the entire duration of the
function).

>
> > +
> > +		if (signal_pending(current)) {
> > +			ret = -EINTR;
> > +			break;
> > +		}
> > +
> > +		if (!timeout) {
> > +			ret = -ETIME;
> > +			break;
> > +		}
> > +
> > +		mutex_unlock(&hwmon->hwmon_lock);
>
> do we need to hold the lock around the signal-pending and timeout
> checks as well, or would wrapping it only around the hwmon->ddat
> access be enough?

Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress
flag. But because this is not a performance path, implementing it as done
in the patch simplifies the code flow (since there are several if/else,
goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider).

So if possible I *really* want to not try to over-optimize here (I did try
a few other things when writing the patch but it was getting ugly). The
only real requirement is to drop the lock before calling schedule_timeout()
below (and we are reacquiring the lock as soon as we are scheduled back in,
as you can see in the loop above).

>
> > +
> > +		timeout = schedule_timeout(timeout);
> >	}
> > +	finish_wait(&ddat->waitq, &wait);
> > +	if (ret)
> > +		goto unlock;
> > +
> >	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> >	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> >	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> >			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> >	hwmon->ddat.reset_in_progress = false;
> > +	wake_up_all(&hwmon->ddat.waitq);
> >
> >	mutex_unlock(&hwmon->hwmon_lock);
> >  }
> > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> >	ddat->uncore = &i915->uncore;
> >	snprintf(ddat->name, sizeof(ddat->name), "i915");
> >	ddat->gt_n = -1;
> > +	init_waitqueue_head(&ddat->waitq);
> >
> >	for_each_gt(gt, i915, i) {
> >		ddat_gt = hwmon->ddat_gt + i;
> > --
> > 2.38.0
> >

From what I understand, the locking above is fine and is not the
point. The real race is between schedule_timeout() (which suspends the
thread) and wake_up_all() (which schedules it back in). But this
prepare_to_wait/finish_wait pattern is so widespread that the kernel
guarantees that this works correctly as long as you do things in the
correct order (otherwise we'd see a lot more kernel hangs/deadlocks).
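
Spelling the ordering out generically, the two sides look something like
this (placeholder names, not the actual i915 ones):

        /* Waiter */
        DEFINE_WAIT(wait);

        for (;;) {
                mutex_lock(&lock);
                prepare_to_wait(&waitq, &wait, TASK_INTERRUPTIBLE); /* 1. queue + set task state */
                if (cond)                                           /* 2. then test the condition */
                        break;
                mutex_unlock(&lock);
                schedule();                                         /* 3. only then sleep */
        }
        finish_wait(&waitq, &wait);
        /* cond is true and the mutex is held here */
        mutex_unlock(&lock);

        /* Waker */
        mutex_lock(&lock);
        cond = true;            /* 1. make the condition true */
        wake_up_all(&waitq);    /* 2. then wake the waiters */
        mutex_unlock(&lock);

If the waker runs before prepare_to_wait(), the waiter sees cond already
true at step 2 and never sleeps; if it runs after, the waiter is already
on the queue with its task state set, so the wakeup cannot be lost and
schedule() returns promptly.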

Thanks,
Ashutosh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-10 22:35 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
@ 2023-04-18  5:35   ` Rodrigo Vivi
  2023-04-18 17:23     ` Dixit, Ashutosh
  2023-04-19 13:21   ` Tvrtko Ursulin
  1 sibling, 1 reply; 25+ messages in thread
From: Rodrigo Vivi @ 2023-04-18  5:35 UTC (permalink / raw)
  To: Ashutosh Dixit; +Cc: intel-gfx, dri-devel

On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> Instead of erroring out when GuC reset is in progress, block waiting for
> GuC reset to complete which is a more reasonable uapi behavior.
> 
> v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> 
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
>  1 file changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> index 9ab8971679fe3..8471a667dfc71 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.c
> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> @@ -51,6 +51,7 @@ struct hwm_drvdata {
>  	char name[12];
>  	int gt_n;
>  	bool reset_in_progress;
> +	wait_queue_head_t waitq;
>  };
>  
>  struct i915_hwmon {
> @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
>  static int
>  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>  {
> +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> +
> +	int ret = 0, timeout = GUC_RESET_TIMEOUT;
>  	struct i915_hwmon *hwmon = ddat->hwmon;
>  	intel_wakeref_t wakeref;
> -	int ret = 0;
> +	DEFINE_WAIT(wait);
>  	u32 nval;
>  
> -	mutex_lock(&hwmon->hwmon_lock);
> -	if (hwmon->ddat.reset_in_progress) {
> -		ret = -EAGAIN;
> -		goto unlock;
> +	/* Block waiting for GuC reset to complete when needed */
> +	for (;;) {
> +		mutex_lock(&hwmon->hwmon_lock);

I'm really afraid of how this mutex is handled with the wait queue.
At first glance it looks like it is trying to reimplement ww_mutex?

all the other examples of wait_queue usage like this either didn't use
locks or had them in a totally different flow that I could not correlate.

> +
> +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> +
> +		if (!hwmon->ddat.reset_in_progress)
> +			break;

If this breaks we never unlock it?

> +
> +		if (signal_pending(current)) {
> +			ret = -EINTR;
> +			break;
> +		}
> +
> +		if (!timeout) {
> +			ret = -ETIME;
> +			break;
> +		}
> +
> +		mutex_unlock(&hwmon->hwmon_lock);

do we need to hold the lock around the signal-pending and timeout checks
as well, or would wrapping it only around the hwmon->ddat access be
enough?

> +
> +		timeout = schedule_timeout(timeout);
>  	}
> +	finish_wait(&ddat->waitq, &wait);
> +	if (ret)
> +		goto unlock;
> +
>  	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
>  
>  	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
>  	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
>  			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
>  	hwmon->ddat.reset_in_progress = false;
> +	wake_up_all(&hwmon->ddat.waitq);
>  
>  	mutex_unlock(&hwmon->hwmon_lock);
>  }
> @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
>  	ddat->uncore = &i915->uncore;
>  	snprintf(ddat->name, sizeof(ddat->name), "i915");
>  	ddat->gt_n = -1;
> +	init_waitqueue_head(&ddat->waitq);
>  
>  	for_each_gt(gt, i915, i) {
>  		ddat_gt = hwmon->ddat_gt + i;
> -- 
> 2.38.0
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-10 22:35 [Intel-gfx] [PATCH 0/3] " Ashutosh Dixit
@ 2023-04-10 22:35 ` Ashutosh Dixit
  2023-04-18  5:35   ` Rodrigo Vivi
  2023-04-19 13:21   ` Tvrtko Ursulin
  0 siblings, 2 replies; 25+ messages in thread
From: Ashutosh Dixit @ 2023-04-10 22:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

Instead of erroring out when GuC reset is in progress, block waiting for
GuC reset to complete which is a more reasonable uapi behavior.

v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++----
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 9ab8971679fe3..8471a667dfc71 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -51,6 +51,7 @@ struct hwm_drvdata {
 	char name[12];
 	int gt_n;
 	bool reset_in_progress;
+	wait_queue_head_t waitq;
 };
 
 struct i915_hwmon {
@@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
 static int
 hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 {
+#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
+
+	int ret = 0, timeout = GUC_RESET_TIMEOUT;
 	struct i915_hwmon *hwmon = ddat->hwmon;
 	intel_wakeref_t wakeref;
-	int ret = 0;
+	DEFINE_WAIT(wait);
 	u32 nval;
 
-	mutex_lock(&hwmon->hwmon_lock);
-	if (hwmon->ddat.reset_in_progress) {
-		ret = -EAGAIN;
-		goto unlock;
+	/* Block waiting for GuC reset to complete when needed */
+	for (;;) {
+		mutex_lock(&hwmon->hwmon_lock);
+
+		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
+
+		if (!hwmon->ddat.reset_in_progress)
+			break;
+
+		if (signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
+
+		if (!timeout) {
+			ret = -ETIME;
+			break;
+		}
+
+		mutex_unlock(&hwmon->hwmon_lock);
+
+		timeout = schedule_timeout(timeout);
 	}
+	finish_wait(&ddat->waitq, &wait);
+	if (ret)
+		goto unlock;
+
 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
 	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
@@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
 	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
 			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
 	hwmon->ddat.reset_in_progress = false;
+	wake_up_all(&hwmon->ddat.waitq);
 
 	mutex_unlock(&hwmon->hwmon_lock);
 }
@@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
 	ddat->uncore = &i915->uncore;
 	snprintf(ddat->name, sizeof(ddat->name), "i915");
 	ddat->gt_n = -1;
+	init_waitqueue_head(&ddat->waitq);
 
 	for_each_gt(gt, i915, i) {
 		ddat_gt = hwmon->ddat_gt + i;
-- 
2.38.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2023-04-20 16:44 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
2023-04-06  4:45 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
2023-04-07 11:08   ` Rodrigo Vivi
2023-04-06  4:45 ` [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
2023-04-06  9:16   ` kernel test robot
2023-04-07 11:08   ` Rodrigo Vivi
2023-04-10 22:17     ` Dixit, Ashutosh
2023-04-06  4:45 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
2023-04-07 11:04   ` Rodrigo Vivi
2023-04-10 22:40     ` Dixit, Ashutosh
2023-04-06  5:15 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/guc: Disable PL1 power limit when loading GuC firmware Patchwork
2023-04-06  5:32 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-04-06 17:42 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2023-04-10 22:35 [Intel-gfx] [PATCH 0/3] " Ashutosh Dixit
2023-04-10 22:35 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
2023-04-18  5:35   ` Rodrigo Vivi
2023-04-18 17:23     ` Dixit, Ashutosh
2023-04-19 19:40       ` Rodrigo Vivi
2023-04-19 22:13         ` Dixit, Ashutosh
2023-04-20 15:45           ` Rodrigo Vivi
2023-04-19 13:21   ` Tvrtko Ursulin
2023-04-19 22:10     ` Dixit, Ashutosh
2023-04-20  7:57       ` Tvrtko Ursulin
2023-04-20 15:43         ` Rodrigo Vivi
2023-04-20 16:26           ` Dixit, Ashutosh
2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
2023-04-20 16:40 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).