intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit
@ 2023-03-31  2:26 Ashutosh Dixit
  2023-03-31  3:16 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/hwmon: Use 0 to designate disabled PL1 power limit (rev2) Patchwork
  2023-03-31 10:23 ` [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit Tvrtko Ursulin
  0 siblings, 2 replies; 4+ messages in thread
From: Ashutosh Dixit @ 2023-03-31  2:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

On ATSM the PL1 limit is disabled at power up. The previous uapi assumed
that the PL1 limit is always enabled and therefore did not have a notion of
a disabled PL1 limit. This results in erroneous PL1 limit values when the
PL1 limit is disabled. For example at power up, the disabled ATSM PL1 limit
was previously shown as 0 which means a low PL1 limit whereas the limit
being disabled actually implies a high effective PL1 limit value.

To get round this problem, the PL1 limit uapi is expanded to include a
special value 0 to designate a disabled PL1 limit. A read value of 0 means
that the PL1 power limit is disabled, writing 0 disables the limit.

The link between this patch and the bugs mentioned below is as follows:
* Because on ATSM the PL1 power limit is disabled on power up and there
  were no means to enable it, we previously implemented the means to
  enable the limit when the PL1 hwmon entry (power1_max) was written to.
* Now there is a IGT igt@i915_hwmon@hwmon_write which (a) reads orig value
  from all hwmon sysfs  (b) does a bunch of random writes and finally (c)
  restores the orig value read. On ATSM since the orig value is 0, when
  the IGT restores the 0 value, the PL1 limit is now enabled with a value
  of 0.
* PL1 limit of 0 implies a low PL1 limit which causes GPU freq to fall to
  100 MHz. This causes GuC FW load and several IGT's to start timing out
  and gives rise to these Intel CI bugs. After this patch, writing 0 would
  disable the PL1 limit instead of enabling it, avoiding the freq drop
  issue.

v2: Add explanation for bugs mentioned below (Rodrigo)

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 .../ABI/testing/sysfs-driver-intel-i915-hwmon |  4 +++-
 drivers/gpu/drm/i915/i915_hwmon.c             | 24 +++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
index 2d6a472eef885..8d7d8f05f6cd0 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
@@ -14,7 +14,9 @@ Description:	RW. Card reactive sustained  (PL1/Tau) power limit in microwatts.
 
 		The power controller will throttle the operating frequency
 		if the power averaged over a window (typically seconds)
-		exceeds this limit.
+		exceeds this limit. A read value of 0 means that the PL1
+		power limit is disabled, writing 0 disables the
+		limit. Writing values > 0 will enable the power limit.
 
 		Only supported for particular Intel i915 graphics platforms.
 
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 596dd2c070106..c099057888914 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -349,6 +349,8 @@ hwm_power_is_visible(const struct hwm_drvdata *ddat, u32 attr, int chan)
 	}
 }
 
+#define PL1_DISABLE 0
+
 /*
  * HW allows arbitrary PL1 limits to be set but silently clamps these values to
  * "typical but not guaranteed" min/max values in rg.pkg_power_sku. Follow the
@@ -362,6 +364,14 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
 	intel_wakeref_t wakeref;
 	u64 r, min, max;
 
+	/* Check if PL1 limit is disabled */
+	with_intel_runtime_pm(ddat->uncore->rpm, wakeref)
+		r = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
+	if (!(r & PKG_PWR_LIM_1_EN)) {
+		*val = PL1_DISABLE;
+		return 0;
+	}
+
 	*val = hwm_field_read_and_scale(ddat,
 					hwmon->rg.pkg_rapl_limit,
 					PKG_PWR_LIM_1,
@@ -385,8 +395,22 @@ static int
 hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 {
 	struct i915_hwmon *hwmon = ddat->hwmon;
+	intel_wakeref_t wakeref;
 	u32 nval;
 
+	if (val == PL1_DISABLE) {
+		/* Disable PL1 limit */
+		hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
+						    PKG_PWR_LIM_1_EN, 0);
+
+		/* Verify, because PL1 limit cannot be disabled on all platforms */
+		with_intel_runtime_pm(ddat->uncore->rpm, wakeref)
+			nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
+		if (nval & PKG_PWR_LIM_1_EN)
+			return -EPERM;
+		return 0;
+	}
+
 	/* Computation in 64-bits to avoid overflow. Round to nearest. */
 	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
 	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
-- 
2.38.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/hwmon: Use 0 to designate disabled PL1 power limit (rev2)
  2023-03-31  2:26 [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit Ashutosh Dixit
@ 2023-03-31  3:16 ` Patchwork
  2023-03-31 10:23 ` [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit Tvrtko Ursulin
  1 sibling, 0 replies; 4+ messages in thread
From: Patchwork @ 2023-03-31  3:16 UTC (permalink / raw)
  To: Dixit, Ashutosh; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 6836 bytes --]

== Series Details ==

Series: drm/i915/hwmon: Use 0 to designate disabled PL1 power limit (rev2)
URL   : https://patchwork.freedesktop.org/series/115749/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_12946 -> Patchwork_115749v2
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_115749v2 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_115749v2, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/index.html

Participating hosts (37 -> 37)
------------------------------

  Additional (1): fi-kbl-soraka 
  Missing    (1): fi-snb-2520m 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_115749v2:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@hangcheck:
    - bat-adlp-9:         [PASS][1] -> [ABORT][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12946/bat-adlp-9/igt@i915_selftest@live@hangcheck.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-adlp-9/igt@i915_selftest@live@hangcheck.html

  
Known issues
------------

  Here are the changes found in Patchwork_115749v2 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3@smem:
    - bat-rpls-1:         NOTRUN -> [ABORT][3] ([i915#6687] / [i915#7978])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-rpls-1/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@gem_huc_copy@huc-copy:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#2190])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/fi-kbl-soraka/igt@gem_huc_copy@huc-copy.html

  * igt@gem_lmem_swapping@basic:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#4613]) +3 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/fi-kbl-soraka/igt@gem_lmem_swapping@basic.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-kbl-soraka:      NOTRUN -> [DMESG-FAIL][6] ([i915#5334] / [i915#7872])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html

  * igt@i915_selftest@live@gt_pm:
    - fi-kbl-soraka:      NOTRUN -> [DMESG-FAIL][7] ([i915#1886])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@i915_selftest@live@slpc:
    - bat-rpls-2:         NOTRUN -> [DMESG-FAIL][8] ([i915#6367] / [i915#7913] / [i915#7996])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-rpls-2/igt@i915_selftest@live@slpc.html

  * igt@kms_chamelium_frames@hdmi-crc-fast:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][9] ([fdo#109271]) +16 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/fi-kbl-soraka/igt@kms_chamelium_frames@hdmi-crc-fast.html

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - bat-rpls-2:         NOTRUN -> [SKIP][10] ([i915#7828])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-rpls-2/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@suspend-read-crc:
    - bat-rpls-2:         NOTRUN -> [SKIP][11] ([i915#1845])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-rpls-2/igt@kms_pipe_crc_basic@suspend-read-crc.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@gt_pm:
    - bat-rpls-2:         [DMESG-FAIL][12] ([i915#4258]) -> [PASS][13]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12946/bat-rpls-2/igt@i915_selftest@live@gt_pm.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-rpls-2/igt@i915_selftest@live@gt_pm.html

  * igt@i915_selftest@live@hangcheck:
    - bat-rpls-1:         [INCOMPLETE][14] ([i915#4983]) -> [PASS][15]
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12946/bat-rpls-1/igt@i915_selftest@live@hangcheck.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-rpls-1/igt@i915_selftest@live@hangcheck.html

  * igt@i915_selftest@live@reset:
    - bat-rpls-2:         [ABORT][16] ([i915#4983] / [i915#7913]) -> [PASS][17]
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12946/bat-rpls-2/igt@i915_selftest@live@reset.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-rpls-2/igt@i915_selftest@live@reset.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1:
    - bat-dg2-8:          [FAIL][18] ([i915#7932]) -> [PASS][19]
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12946/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#4258]: https://gitlab.freedesktop.org/drm/intel/issues/4258
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6687]: https://gitlab.freedesktop.org/drm/intel/issues/6687
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7872]: https://gitlab.freedesktop.org/drm/intel/issues/7872
  [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
  [i915#7932]: https://gitlab.freedesktop.org/drm/intel/issues/7932
  [i915#7978]: https://gitlab.freedesktop.org/drm/intel/issues/7978
  [i915#7996]: https://gitlab.freedesktop.org/drm/intel/issues/7996


Build changes
-------------

  * Linux: CI_DRM_12946 -> Patchwork_115749v2

  CI-20190529: 20190529
  CI_DRM_12946: 1a734c75a7bbe1169d90a9fb2b24b0345926a769 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7230: f0485204004305dd3ee8f8bbbb9c552e53a4e050 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_115749v2: 1a734c75a7bbe1169d90a9fb2b24b0345926a769 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

5c483153a4ef drm/i915/hwmon: Use 0 to designate disabled PL1 power limit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115749v2/index.html

[-- Attachment #2: Type: text/html, Size: 8060 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit
  2023-03-31  2:26 [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit Ashutosh Dixit
  2023-03-31  3:16 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/hwmon: Use 0 to designate disabled PL1 power limit (rev2) Patchwork
@ 2023-03-31 10:23 ` Tvrtko Ursulin
  2023-04-01  2:44   ` Dixit, Ashutosh
  1 sibling, 1 reply; 4+ messages in thread
From: Tvrtko Ursulin @ 2023-03-31 10:23 UTC (permalink / raw)
  To: Ashutosh Dixit, intel-gfx; +Cc: dri-devel, Rodrigo Vivi


On 31/03/2023 03:26, Ashutosh Dixit wrote:
> On ATSM the PL1 limit is disabled at power up. The previous uapi assumed
> that the PL1 limit is always enabled and therefore did not have a notion of
> a disabled PL1 limit. This results in erroneous PL1 limit values when the
> PL1 limit is disabled. For example at power up, the disabled ATSM PL1 limit
> was previously shown as 0 which means a low PL1 limit whereas the limit
> being disabled actually implies a high effective PL1 limit value.
> 
> To get round this problem, the PL1 limit uapi is expanded to include a
> special value 0 to designate a disabled PL1 limit. A read value of 0 means
> that the PL1 power limit is disabled, writing 0 disables the limit.
> 
> The link between this patch and the bugs mentioned below is as follows:
> * Because on ATSM the PL1 power limit is disabled on power up and there
>    were no means to enable it, we previously implemented the means to
>    enable the limit when the PL1 hwmon entry (power1_max) was written to.
> * Now there is a IGT igt@i915_hwmon@hwmon_write which (a) reads orig value
>    from all hwmon sysfs  (b) does a bunch of random writes and finally (c)
>    restores the orig value read. On ATSM since the orig value is 0, when
>    the IGT restores the 0 value, the PL1 limit is now enabled with a value
>    of 0.
> * PL1 limit of 0 implies a low PL1 limit which causes GPU freq to fall to
>    100 MHz. This causes GuC FW load and several IGT's to start timing out
>    and gives rise to these Intel CI bugs. After this patch, writing 0 would
>    disable the PL1 limit instead of enabling it, avoiding the freq drop
>    issue.
> 
> v2: Add explanation for bugs mentioned below (Rodrigo)
> 
> Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   .../ABI/testing/sysfs-driver-intel-i915-hwmon |  4 +++-
>   drivers/gpu/drm/i915/i915_hwmon.c             | 24 +++++++++++++++++++
>   2 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> index 2d6a472eef885..8d7d8f05f6cd0 100644
> --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
> @@ -14,7 +14,9 @@ Description:	RW. Card reactive sustained  (PL1/Tau) power limit in microwatts.
>   
>   		The power controller will throttle the operating frequency
>   		if the power averaged over a window (typically seconds)
> -		exceeds this limit.
> +		exceeds this limit. A read value of 0 means that the PL1
> +		power limit is disabled, writing 0 disables the
> +		limit. Writing values > 0 will enable the power limit.
>   
>   		Only supported for particular Intel i915 graphics platforms.
>   
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> index 596dd2c070106..c099057888914 100644
> --- a/drivers/gpu/drm/i915/i915_hwmon.c
> +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> @@ -349,6 +349,8 @@ hwm_power_is_visible(const struct hwm_drvdata *ddat, u32 attr, int chan)
>   	}
>   }
>   
> +#define PL1_DISABLE 0
> +
>   /*
>    * HW allows arbitrary PL1 limits to be set but silently clamps these values to
>    * "typical but not guaranteed" min/max values in rg.pkg_power_sku. Follow the
> @@ -362,6 +364,14 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val)
>   	intel_wakeref_t wakeref;
>   	u64 r, min, max;
>   
> +	/* Check if PL1 limit is disabled */
> +	with_intel_runtime_pm(ddat->uncore->rpm, wakeref)
> +		r = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
> +	if (!(r & PKG_PWR_LIM_1_EN)) {
> +		*val = PL1_DISABLE;
> +		return 0;
> +	}
> +
>   	*val = hwm_field_read_and_scale(ddat,
>   					hwmon->rg.pkg_rapl_limit,
>   					PKG_PWR_LIM_1,
> @@ -385,8 +395,22 @@ static int
>   hwm_power_max_write(struct hwm_drvdata *ddat, long val)
>   {
>   	struct i915_hwmon *hwmon = ddat->hwmon;
> +	intel_wakeref_t wakeref;
>   	u32 nval;
>   
> +	if (val == PL1_DISABLE) {
> +		/* Disable PL1 limit */
> +		hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
> +						    PKG_PWR_LIM_1_EN, 0);
> +
> +		/* Verify, because PL1 limit cannot be disabled on all platforms */

I think there is a race right here, since above grabbed and released the 
hwmon_lock, anyone can modify it at this point before the verification 
below. Not sure if any consequences worse than a wrong -EPERM are 
possible though.

Also, is EPERM correct for something hardware does not support? We 
usually say ENODEV for such things, IIRC at least.

Anyway, race looks easily solvable by holding the existing mutex and a 
single rpm ref for the whole rmw-r cycle.

Regards,

Tvrtko

> +		with_intel_runtime_pm(ddat->uncore->rpm, wakeref)
> +			nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
> +		if (nval & PKG_PWR_LIM_1_EN)
> +			return -EPERM;
> +		return 0;
> +	}
> +
>   	/* Computation in 64-bits to avoid overflow. Round to nearest. */
>   	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
>   	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit
  2023-03-31 10:23 ` [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit Tvrtko Ursulin
@ 2023-04-01  2:44   ` Dixit, Ashutosh
  0 siblings, 0 replies; 4+ messages in thread
From: Dixit, Ashutosh @ 2023-04-01  2:44 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

On Fri, 31 Mar 2023 03:23:33 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> > @@ -385,8 +395,22 @@ static int
> >   hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >   {
> >	struct i915_hwmon *hwmon = ddat->hwmon;
> > +	intel_wakeref_t wakeref;
> >	u32 nval;
> >   +	if (val == PL1_DISABLE) {
> > +		/* Disable PL1 limit */
> > +		hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
> > +						    PKG_PWR_LIM_1_EN, 0);
> > +
> > +		/* Verify, because PL1 limit cannot be disabled on all platforms */
>
> I think there is a race right here, since above grabbed and released the
> hwmon_lock, anyone can modify it at this point before the verification
> below. Not sure if any consequences worse than a wrong -EPERM are possible
> though.
>
> Also, is EPERM correct for something hardware does not support? We usually
> say ENODEV for such things, IIRC at least.

Changed to -ENODEV in v3.

> Anyway, race looks easily solvable by holding the existing mutex and a
> single rpm ref for the whole rmw-r cycle.

Fixed in v3, thanks for catching these.

Ashutosh

> > +		with_intel_runtime_pm(ddat->uncore->rpm, wakeref)
> > +			nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
> > +		if (nval & PKG_PWR_LIM_1_EN)
> > +			return -EPERM;
> > +		return 0;
> > +	}
> > +
> >	/* Computation in 64-bits to avoid overflow. Round to nearest. */
> >	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
> >	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-04-01  2:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-31  2:26 [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit Ashutosh Dixit
2023-03-31  3:16 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/hwmon: Use 0 to designate disabled PL1 power limit (rev2) Patchwork
2023-03-31 10:23 ` [Intel-gfx] [PATCH v2] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit Tvrtko Ursulin
2023-04-01  2:44   ` Dixit, Ashutosh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).