linux-clk.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
@ 2022-10-05  9:06 Akhil P Oommen
  2022-10-05  9:06 ` [PATCH v7 1/6] dt-bindings: clk: qcom: Support gpu cx gdsc reset Akhil P Oommen
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-10-05  9:06 UTC (permalink / raw)
  To: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel
  Cc: Douglas Anderson, krzysztof.kozlowski, Akhil P Oommen,
	Abhinav Kumar, Andy Gross, Daniel Vetter, David Airlie,
	Konrad Dybcio, Krzysztof Kozlowski, Michael Turquette,
	Rob Herring, Sean Paul, Stephen Boyd, devicetree, linux-clk,
	linux-kernel


Some clients like adreno gpu driver would like to ensure that its gdsc
is collapsed at hardware during a gpu reset sequence. This is because it
has a votable gdsc which could be ON due to a vote from another subsystem
like tz, hyp etc or due to an internal hardware signal. To allow
this, gpucc driver can expose an interface to the client driver using
reset framework. Using this the client driver can trigger a polling within
the gdsc driver.

This series is rebased on top of qcom/linux:for-next branch.

Related discussion: https://patchwork.freedesktop.org/patch/493144/

Changes in v7:
- Update commit message (Bjorn)
- Rebased on top of qcom/linux:for-next branch.

Changes in v6:
- No code changes in this version. Just captured the Acked-by tags

Changes in v5:
- Nit: Remove a duplicate blank line (Krzysztof)

Changes in v4:
- Update gpu dt-binding schema
- Typo fix in commit text

Changes in v3:
- Use pointer to const for "struct qcom_reset_ops" in qcom_reset_map (Krzysztof)

Changes in v2:
- Return error when a particular custom reset op is not implemented. (Dmitry)

Akhil P Oommen (6):
  dt-bindings: clk: qcom: Support gpu cx gdsc reset
  clk: qcom: Allow custom reset ops
  clk: qcom: gdsc: Add a reset op to poll gdsc collapse
  clk: qcom: gpucc-sc7280: Add cx collapse reset support
  dt-bindings: drm/msm/gpu: Add optional resets
  arm64: dts: qcom: sc7280: Add Reset support for gpu

 .../devicetree/bindings/display/msm/gpu.yaml       |  6 +++++
 arch/arm64/boot/dts/qcom/sc7280.dtsi               |  3 +++
 drivers/clk/qcom/gdsc.c                            | 23 ++++++++++++++----
 drivers/clk/qcom/gdsc.h                            |  7 ++++++
 drivers/clk/qcom/gpucc-sc7280.c                    | 10 ++++++++
 drivers/clk/qcom/reset.c                           | 27 +++++++++++++++++++++-
 drivers/clk/qcom/reset.h                           |  8 +++++++
 include/dt-bindings/clock/qcom,gpucc-sc7280.h      |  3 +++
 8 files changed, 82 insertions(+), 5 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v7 1/6] dt-bindings: clk: qcom: Support gpu cx gdsc reset
  2022-10-05  9:06 [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
@ 2022-10-05  9:06 ` Akhil P Oommen
  2022-10-05  9:07 ` [PATCH v7 2/6] clk: qcom: Allow custom reset ops Akhil P Oommen
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-10-05  9:06 UTC (permalink / raw)
  To: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel
  Cc: Douglas Anderson, krzysztof.kozlowski, Akhil P Oommen,
	Andy Gross, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Stephen Boyd, devicetree,
	linux-clk, linux-kernel

Add necessary definitions in gpucc bindings to ensure gpu cx gdsc collapse
through 'reset' framework for SC7280.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---

(no changes since v1)

 include/dt-bindings/clock/qcom,gpucc-sc7280.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/dt-bindings/clock/qcom,gpucc-sc7280.h b/include/dt-bindings/clock/qcom,gpucc-sc7280.h
index 669b23b..843a31b 100644
--- a/include/dt-bindings/clock/qcom,gpucc-sc7280.h
+++ b/include/dt-bindings/clock/qcom,gpucc-sc7280.h
@@ -32,4 +32,7 @@
 #define GPU_CC_CX_GDSC				0
 #define GPU_CC_GX_GDSC				1
 
+/* GPU_CC reset IDs */
+#define GPU_CX_COLLAPSE				0
+
 #endif
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 2/6] clk: qcom: Allow custom reset ops
  2022-10-05  9:06 [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
  2022-10-05  9:06 ` [PATCH v7 1/6] dt-bindings: clk: qcom: Support gpu cx gdsc reset Akhil P Oommen
@ 2022-10-05  9:07 ` Akhil P Oommen
  2022-10-05  9:07 ` [PATCH v7 3/6] clk: qcom: gdsc: Add a reset op to poll gdsc collapse Akhil P Oommen
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-10-05  9:07 UTC (permalink / raw)
  To: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel
  Cc: Douglas Anderson, krzysztof.kozlowski, Akhil P Oommen,
	Andy Gross, Konrad Dybcio, Michael Turquette, Stephen Boyd,
	linux-clk, linux-kernel

Allow soc specific clk drivers to specify a custom reset operation. We
will use this in an upcoming patch to allow gpucc driver to specify a
differet reset operation for cx_gdsc.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---

(no changes since v3)

Changes in v3:
- Use pointer to const for "struct qcom_reset_ops" in qcom_reset_map (Krzysztof)

Changes in v2:
- Return error when a particular custom reset op is not implemented. (Dmitry)

 drivers/clk/qcom/reset.c | 27 ++++++++++++++++++++++++++-
 drivers/clk/qcom/reset.h |  8 ++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/qcom/reset.c b/drivers/clk/qcom/reset.c
index 2a16adb..10ef71b 100644
--- a/drivers/clk/qcom/reset.c
+++ b/drivers/clk/qcom/reset.c
@@ -13,7 +13,20 @@
 
 static int qcom_reset(struct reset_controller_dev *rcdev, unsigned long id)
 {
-	struct qcom_reset_controller *rst = to_qcom_reset_controller(rcdev);
+	struct qcom_reset_controller *rst;
+	const struct qcom_reset_map *map;
+
+	rst = to_qcom_reset_controller(rcdev);
+	map = &rst->reset_map[id];
+
+	if (map->ops && map->ops->reset)
+		return map->ops->reset(map->priv);
+	/*
+	 * If custom ops is implemented but just not this callback, return
+	 * error
+	 */
+	else if (map->ops)
+		return -EOPNOTSUPP;
 
 	rcdev->ops->assert(rcdev, id);
 	udelay(rst->reset_map[id].udelay ?: 1); /* use 1 us as default */
@@ -30,6 +43,12 @@ qcom_reset_assert(struct reset_controller_dev *rcdev, unsigned long id)
 
 	rst = to_qcom_reset_controller(rcdev);
 	map = &rst->reset_map[id];
+
+	if (map->ops && map->ops->assert)
+		return map->ops->assert(map->priv);
+	else if (map->ops)
+		return -EOPNOTSUPP;
+
 	mask = BIT(map->bit);
 
 	return regmap_update_bits(rst->regmap, map->reg, mask, mask);
@@ -44,6 +63,12 @@ qcom_reset_deassert(struct reset_controller_dev *rcdev, unsigned long id)
 
 	rst = to_qcom_reset_controller(rcdev);
 	map = &rst->reset_map[id];
+
+	if (map->ops && map->ops->deassert)
+		return map->ops->deassert(map->priv);
+	else if (map->ops)
+		return -EOPNOTSUPP;
+
 	mask = BIT(map->bit);
 
 	return regmap_update_bits(rst->regmap, map->reg, mask, 0);
diff --git a/drivers/clk/qcom/reset.h b/drivers/clk/qcom/reset.h
index b8c1135..a4d767b 100644
--- a/drivers/clk/qcom/reset.h
+++ b/drivers/clk/qcom/reset.h
@@ -8,10 +8,18 @@
 
 #include <linux/reset-controller.h>
 
+struct qcom_reset_ops {
+	int (*reset)(void *priv);
+	int (*assert)(void *priv);
+	int (*deassert)(void *priv);
+};
+
 struct qcom_reset_map {
 	unsigned int reg;
 	u8 bit;
 	u8 udelay;
+	const struct qcom_reset_ops *ops;
+	void *priv;
 };
 
 struct regmap;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 3/6] clk: qcom: gdsc: Add a reset op to poll gdsc collapse
  2022-10-05  9:06 [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
  2022-10-05  9:06 ` [PATCH v7 1/6] dt-bindings: clk: qcom: Support gpu cx gdsc reset Akhil P Oommen
  2022-10-05  9:07 ` [PATCH v7 2/6] clk: qcom: Allow custom reset ops Akhil P Oommen
@ 2022-10-05  9:07 ` Akhil P Oommen
  2022-12-07 15:45   ` Ulf Hansson
  2022-10-05  9:07 ` [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support Akhil P Oommen
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Akhil P Oommen @ 2022-10-05  9:07 UTC (permalink / raw)
  To: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel
  Cc: Douglas Anderson, krzysztof.kozlowski, Akhil P Oommen,
	Andy Gross, Konrad Dybcio, Michael Turquette, Stephen Boyd,
	linux-clk, linux-kernel

Add a reset op compatible function to poll for gdsc collapse. This is
required because:
  1. We don't wait for it to turn OFF at hardware for VOTABLE GDSCs.
  2. There is no way for client drivers (eg. gpu driver) to do
  put-with-wait for these gdscs which is required in some scenarios
  (eg. GPU recovery).

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---

Changes in v7:
- Update commit message (Bjorn)

Changes in v2:
- Minor update to function prototype

 drivers/clk/qcom/gdsc.c | 23 +++++++++++++++++++----
 drivers/clk/qcom/gdsc.h |  7 +++++++
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
index 7cf5e13..ccef742 100644
--- a/drivers/clk/qcom/gdsc.c
+++ b/drivers/clk/qcom/gdsc.c
@@ -17,6 +17,7 @@
 #include <linux/reset-controller.h>
 #include <linux/slab.h>
 #include "gdsc.h"
+#include "reset.h"
 
 #define PWR_ON_MASK		BIT(31)
 #define EN_REST_WAIT_MASK	GENMASK_ULL(23, 20)
@@ -116,7 +117,8 @@ static int gdsc_hwctrl(struct gdsc *sc, bool en)
 	return regmap_update_bits(sc->regmap, sc->gdscr, HW_CONTROL_MASK, val);
 }
 
-static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status)
+static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status,
+		s64 timeout_us, unsigned int interval_ms)
 {
 	ktime_t start;
 
@@ -124,7 +126,9 @@ static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status)
 	do {
 		if (gdsc_check_status(sc, status))
 			return 0;
-	} while (ktime_us_delta(ktime_get(), start) < TIMEOUT_US);
+		if (interval_ms)
+			msleep(interval_ms);
+	} while (ktime_us_delta(ktime_get(), start) < timeout_us);
 
 	if (gdsc_check_status(sc, status))
 		return 0;
@@ -189,7 +193,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
 		udelay(1);
 	}
 
-	ret = gdsc_poll_status(sc, status);
+	ret = gdsc_poll_status(sc, status, TIMEOUT_US, 0);
 	WARN(ret, "%s status stuck at 'o%s'", sc->pd.name, status ? "ff" : "n");
 
 	if (!ret && status == GDSC_OFF && sc->rsupply) {
@@ -360,7 +364,7 @@ static int _gdsc_disable(struct gdsc *sc)
 		 */
 		udelay(1);
 
-		ret = gdsc_poll_status(sc, GDSC_ON);
+		ret = gdsc_poll_status(sc, GDSC_ON, TIMEOUT_US, 0);
 		if (ret)
 			return ret;
 	}
@@ -608,3 +612,14 @@ int gdsc_gx_do_nothing_enable(struct generic_pm_domain *domain)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(gdsc_gx_do_nothing_enable);
+
+int gdsc_wait_for_collapse(void *priv)
+{
+	struct gdsc *sc = priv;
+	int ret;
+
+	ret = gdsc_poll_status(sc, GDSC_OFF, 500000, 5);
+	WARN(ret, "%s status stuck at 'on'", sc->pd.name);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(gdsc_wait_for_collapse);
diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h
index 981a12c..5395f69 100644
--- a/drivers/clk/qcom/gdsc.h
+++ b/drivers/clk/qcom/gdsc.h
@@ -12,6 +12,7 @@
 struct regmap;
 struct regulator;
 struct reset_controller_dev;
+struct qcom_reset_map;
 
 /**
  * struct gdsc - Globally Distributed Switch Controller
@@ -88,6 +89,7 @@ int gdsc_register(struct gdsc_desc *desc, struct reset_controller_dev *,
 		  struct regmap *);
 void gdsc_unregister(struct gdsc_desc *desc);
 int gdsc_gx_do_nothing_enable(struct generic_pm_domain *domain);
+int gdsc_wait_for_collapse(void *priv);
 #else
 static inline int gdsc_register(struct gdsc_desc *desc,
 				struct reset_controller_dev *rcdev,
@@ -97,5 +99,10 @@ static inline int gdsc_register(struct gdsc_desc *desc,
 }
 
 static inline void gdsc_unregister(struct gdsc_desc *desc) {};
+
+static int gdsc_wait_for_collapse(void *priv)
+{
+	return  -ENOSYS;
+}
 #endif /* CONFIG_QCOM_GDSC */
 #endif /* __QCOM_GDSC_H__ */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support
  2022-10-05  9:06 [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
                   ` (2 preceding siblings ...)
  2022-10-05  9:07 ` [PATCH v7 3/6] clk: qcom: gdsc: Add a reset op to poll gdsc collapse Akhil P Oommen
@ 2022-10-05  9:07 ` Akhil P Oommen
  2022-12-07 15:46   ` Ulf Hansson
  2022-11-07 16:54 ` [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
  2022-12-01 22:57 ` Bjorn Andersson
  5 siblings, 1 reply; 27+ messages in thread
From: Akhil P Oommen @ 2022-10-05  9:07 UTC (permalink / raw)
  To: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel
  Cc: Douglas Anderson, krzysztof.kozlowski, Akhil P Oommen,
	Andy Gross, Konrad Dybcio, Michael Turquette, Stephen Boyd,
	linux-clk, linux-kernel

Allow a consumer driver to poll for cx gdsc collapse through Reset
framework.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---

(no changes since v3)

Changes in v3:
- Convert 'struct qcom_reset_ops cx_gdsc_reset' to 'static const' (Krzysztof)

Changes in v2:
- Minor update to use the updated custom reset ops implementation

 drivers/clk/qcom/gpucc-sc7280.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/clk/qcom/gpucc-sc7280.c b/drivers/clk/qcom/gpucc-sc7280.c
index 9a832f2..fece3f4 100644
--- a/drivers/clk/qcom/gpucc-sc7280.c
+++ b/drivers/clk/qcom/gpucc-sc7280.c
@@ -433,12 +433,22 @@ static const struct regmap_config gpu_cc_sc7280_regmap_config = {
 	.fast_io = true,
 };
 
+static const struct qcom_reset_ops cx_gdsc_reset = {
+	.reset = gdsc_wait_for_collapse,
+};
+
+static const struct qcom_reset_map gpucc_sc7280_resets[] = {
+	[GPU_CX_COLLAPSE] = { .ops = &cx_gdsc_reset, .priv = &cx_gdsc },
+};
+
 static const struct qcom_cc_desc gpu_cc_sc7280_desc = {
 	.config = &gpu_cc_sc7280_regmap_config,
 	.clks = gpu_cc_sc7280_clocks,
 	.num_clks = ARRAY_SIZE(gpu_cc_sc7280_clocks),
 	.gdscs = gpu_cc_sc7180_gdscs,
 	.num_gdscs = ARRAY_SIZE(gpu_cc_sc7180_gdscs),
+	.resets = gpucc_sc7280_resets,
+	.num_resets = ARRAY_SIZE(gpucc_sc7280_resets),
 };
 
 static const struct of_device_id gpu_cc_sc7280_match_table[] = {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-10-05  9:06 [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
                   ` (3 preceding siblings ...)
  2022-10-05  9:07 ` [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support Akhil P Oommen
@ 2022-11-07 16:54 ` Akhil P Oommen
  2022-12-01 22:57 ` Bjorn Andersson
  5 siblings, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-11-07 16:54 UTC (permalink / raw)
  To: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel
  Cc: Douglas Anderson, krzysztof.kozlowski, Abhinav Kumar, Andy Gross,
	Daniel Vetter, David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On 10/5/2022 2:36 PM, Akhil P Oommen wrote:
> Some clients like adreno gpu driver would like to ensure that its gdsc
> is collapsed at hardware during a gpu reset sequence. This is because it
> has a votable gdsc which could be ON due to a vote from another subsystem
> like tz, hyp etc or due to an internal hardware signal. To allow
> this, gpucc driver can expose an interface to the client driver using
> reset framework. Using this the client driver can trigger a polling within
> the gdsc driver.
>
> This series is rebased on top of qcom/linux:for-next branch.
>
> Related discussion: https://patchwork.freedesktop.org/patch/493144/
>
> Changes in v7:
> - Update commit message (Bjorn)
> - Rebased on top of qcom/linux:for-next branch.
>
> Changes in v6:
> - No code changes in this version. Just captured the Acked-by tags
>
> Changes in v5:
> - Nit: Remove a duplicate blank line (Krzysztof)
>
> Changes in v4:
> - Update gpu dt-binding schema
> - Typo fix in commit text
>
> Changes in v3:
> - Use pointer to const for "struct qcom_reset_ops" in qcom_reset_map (Krzysztof)
>
> Changes in v2:
> - Return error when a particular custom reset op is not implemented. (Dmitry)
>
> Akhil P Oommen (6):
>    dt-bindings: clk: qcom: Support gpu cx gdsc reset
>    clk: qcom: Allow custom reset ops
>    clk: qcom: gdsc: Add a reset op to poll gdsc collapse
>    clk: qcom: gpucc-sc7280: Add cx collapse reset support
>    dt-bindings: drm/msm/gpu: Add optional resets
>    arm64: dts: qcom: sc7280: Add Reset support for gpu
>
>   .../devicetree/bindings/display/msm/gpu.yaml       |  6 +++++
>   arch/arm64/boot/dts/qcom/sc7280.dtsi               |  3 +++
>   drivers/clk/qcom/gdsc.c                            | 23 ++++++++++++++----
>   drivers/clk/qcom/gdsc.h                            |  7 ++++++
>   drivers/clk/qcom/gpucc-sc7280.c                    | 10 ++++++++
>   drivers/clk/qcom/reset.c                           | 27 +++++++++++++++++++++-
>   drivers/clk/qcom/reset.h                           |  8 +++++++
>   include/dt-bindings/clock/qcom,gpucc-sc7280.h      |  3 +++
>   8 files changed, 82 insertions(+), 5 deletions(-)
>
Bjorn,

The latest patchset has been in the mailing list for over a month now. 
Could you please share how soon we can pick this? That will give me some 
confidence to pull these patches into our chromeos kernel tree ASAP.

-Akhil.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-10-05  9:06 [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
                   ` (4 preceding siblings ...)
  2022-11-07 16:54 ` [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
@ 2022-12-01 22:57 ` Bjorn Andersson
  2022-12-02  7:00   ` [Freedreno] " Akhil P Oommen
  2022-12-07 16:00   ` Ulf Hansson
  5 siblings, 2 replies; 27+ messages in thread
From: Bjorn Andersson @ 2022-12-01 22:57 UTC (permalink / raw)
  To: Akhil P Oommen, Ulf Hansson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> 

@Ulf, Akhil has a power-domain for a piece of hardware which may be
voted active by multiple different subsystems (co-processors/execution
contexts) in the system.

As such, during the powering down sequence we don't wait for the
power-domain to turn off. But in the event of an error, the recovery
mechanism relies on waiting for the hardware to settle in a powered off
state.

The proposal here is to use the reset framework to wait for this state
to be reached, before continuing with the recovery mechanism in the
client driver.

Given our other discussions on quirky behavior, do you have any
input/suggestions on this?

> Some clients like adreno gpu driver would like to ensure that its gdsc
> is collapsed at hardware during a gpu reset sequence. This is because it
> has a votable gdsc which could be ON due to a vote from another subsystem
> like tz, hyp etc or due to an internal hardware signal. To allow
> this, gpucc driver can expose an interface to the client driver using
> reset framework. Using this the client driver can trigger a polling within
> the gdsc driver.

@Akhil, this description is fairly generic. As we've reached the state
where the hardware has settled and we return to the client, what
prevents it from being powered up again?

Or is it simply a question of it hitting the powered-off state, not
necessarily staying there?

Regards,
Bjorn

> 
> This series is rebased on top of qcom/linux:for-next branch.
> 
> Related discussion: https://patchwork.freedesktop.org/patch/493144/
> 
> Changes in v7:
> - Update commit message (Bjorn)
> - Rebased on top of qcom/linux:for-next branch.
> 
> Changes in v6:
> - No code changes in this version. Just captured the Acked-by tags
> 
> Changes in v5:
> - Nit: Remove a duplicate blank line (Krzysztof)
> 
> Changes in v4:
> - Update gpu dt-binding schema
> - Typo fix in commit text
> 
> Changes in v3:
> - Use pointer to const for "struct qcom_reset_ops" in qcom_reset_map (Krzysztof)
> 
> Changes in v2:
> - Return error when a particular custom reset op is not implemented. (Dmitry)
> 
> Akhil P Oommen (6):
>   dt-bindings: clk: qcom: Support gpu cx gdsc reset
>   clk: qcom: Allow custom reset ops
>   clk: qcom: gdsc: Add a reset op to poll gdsc collapse
>   clk: qcom: gpucc-sc7280: Add cx collapse reset support
>   dt-bindings: drm/msm/gpu: Add optional resets
>   arm64: dts: qcom: sc7280: Add Reset support for gpu
> 
>  .../devicetree/bindings/display/msm/gpu.yaml       |  6 +++++
>  arch/arm64/boot/dts/qcom/sc7280.dtsi               |  3 +++
>  drivers/clk/qcom/gdsc.c                            | 23 ++++++++++++++----
>  drivers/clk/qcom/gdsc.h                            |  7 ++++++
>  drivers/clk/qcom/gpucc-sc7280.c                    | 10 ++++++++
>  drivers/clk/qcom/reset.c                           | 27 +++++++++++++++++++++-
>  drivers/clk/qcom/reset.h                           |  8 +++++++
>  include/dt-bindings/clock/qcom,gpucc-sc7280.h      |  3 +++
>  8 files changed, 82 insertions(+), 5 deletions(-)
> 
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Freedreno] [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-01 22:57 ` Bjorn Andersson
@ 2022-12-02  7:00   ` Akhil P Oommen
  2022-12-06 19:58     ` Akhil P Oommen
  2022-12-07 16:00   ` Ulf Hansson
  1 sibling, 1 reply; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-02  7:00 UTC (permalink / raw)
  To: Bjorn Andersson, Ulf Hansson
  Cc: David Airlie, Michael Turquette, Konrad Dybcio, dri-devel,
	linux-kernel, Krzysztof Kozlowski, linux-clk, Rob Clark,
	Andy Gross, devicetree, Philipp Zabel, linux-arm-msm,
	Abhinav Kumar, Stephen Boyd, Rob Herring, Sean Paul,
	Stephen Boyd, Douglas Anderson, krzysztof.kozlowski,
	Daniel Vetter, Dmitry Baryshkov, freedreno

On 12/2/2022 4:27 AM, Bjorn Andersson wrote:
> On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> @Ulf, Akhil has a power-domain for a piece of hardware which may be
> voted active by multiple different subsystems (co-processors/execution
> contexts) in the system.
>
> As such, during the powering down sequence we don't wait for the
> power-domain to turn off. But in the event of an error, the recovery
> mechanism relies on waiting for the hardware to settle in a powered off
> state.
>
> The proposal here is to use the reset framework to wait for this state
> to be reached, before continuing with the recovery mechanism in the
> client driver.
>
> Given our other discussions on quirky behavior, do you have any
> input/suggestions on this?
>
>> Some clients like adreno gpu driver would like to ensure that its gdsc
>> is collapsed at hardware during a gpu reset sequence. This is because it
>> has a votable gdsc which could be ON due to a vote from another subsystem
>> like tz, hyp etc or due to an internal hardware signal. To allow
>> this, gpucc driver can expose an interface to the client driver using
>> reset framework. Using this the client driver can trigger a polling within
>> the gdsc driver.
> @Akhil, this description is fairly generic. As we've reached the state
> where the hardware has settled and we return to the client, what
> prevents it from being powered up again?
>
> Or is it simply a question of it hitting the powered-off state, not
> necessarily staying there?
Correct. It doesn't need to stay there. The intention is to hit the powered-off state at least once to clear all the internal hw states (basically a hw reset).

-Akhil.
>
> Regards,
> Bjorn
>
>> This series is rebased on top of qcom/linux:for-next branch.
>>
>> Related discussion: https://patchwork.freedesktop.org/patch/493144/
>>
>> Changes in v7:
>> - Update commit message (Bjorn)
>> - Rebased on top of qcom/linux:for-next branch.
>>
>> Changes in v6:
>> - No code changes in this version. Just captured the Acked-by tags
>>
>> Changes in v5:
>> - Nit: Remove a duplicate blank line (Krzysztof)
>>
>> Changes in v4:
>> - Update gpu dt-binding schema
>> - Typo fix in commit text
>>
>> Changes in v3:
>> - Use pointer to const for "struct qcom_reset_ops" in qcom_reset_map (Krzysztof)
>>
>> Changes in v2:
>> - Return error when a particular custom reset op is not implemented. (Dmitry)
>>
>> Akhil P Oommen (6):
>>   dt-bindings: clk: qcom: Support gpu cx gdsc reset
>>   clk: qcom: Allow custom reset ops
>>   clk: qcom: gdsc: Add a reset op to poll gdsc collapse
>>   clk: qcom: gpucc-sc7280: Add cx collapse reset support
>>   dt-bindings: drm/msm/gpu: Add optional resets
>>   arm64: dts: qcom: sc7280: Add Reset support for gpu
>>
>>  .../devicetree/bindings/display/msm/gpu.yaml       |  6 +++++
>>  arch/arm64/boot/dts/qcom/sc7280.dtsi               |  3 +++
>>  drivers/clk/qcom/gdsc.c                            | 23 ++++++++++++++----
>>  drivers/clk/qcom/gdsc.h                            |  7 ++++++
>>  drivers/clk/qcom/gpucc-sc7280.c                    | 10 ++++++++
>>  drivers/clk/qcom/reset.c                           | 27 +++++++++++++++++++++-
>>  drivers/clk/qcom/reset.h                           |  8 +++++++
>>  include/dt-bindings/clock/qcom,gpucc-sc7280.h      |  3 +++
>>  8 files changed, 82 insertions(+), 5 deletions(-)
>>
>> -- 
>> 2.7.4
>>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Freedreno] [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-02  7:00   ` [Freedreno] " Akhil P Oommen
@ 2022-12-06 19:58     ` Akhil P Oommen
  0 siblings, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-06 19:58 UTC (permalink / raw)
  To: Bjorn Andersson, Ulf Hansson
  Cc: David Airlie, Michael Turquette, Konrad Dybcio, dri-devel,
	linux-kernel, Krzysztof Kozlowski, linux-clk, Rob Clark,
	Andy Gross, devicetree, Philipp Zabel, linux-arm-msm,
	Abhinav Kumar, Stephen Boyd, Rob Herring, Sean Paul,
	Stephen Boyd, Douglas Anderson, krzysztof.kozlowski,
	Daniel Vetter, Dmitry Baryshkov, freedreno

On 12/2/2022 12:30 PM, Akhil P Oommen wrote:
> On 12/2/2022 4:27 AM, Bjorn Andersson wrote:
>> On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
>> @Ulf, Akhil has a power-domain for a piece of hardware which may be
>> voted active by multiple different subsystems (co-processors/execution
>> contexts) in the system.
>>
>> As such, during the powering down sequence we don't wait for the
>> power-domain to turn off. But in the event of an error, the recovery
>> mechanism relies on waiting for the hardware to settle in a powered off
>> state.
>>
>> The proposal here is to use the reset framework to wait for this state
>> to be reached, before continuing with the recovery mechanism in the
>> client driver.
>>
>> Given our other discussions on quirky behavior, do you have any
>> input/suggestions on this?
Ulf,

Gentle ping! Could you please share your feedback?

-Akhil.
>>
>>> Some clients like adreno gpu driver would like to ensure that its gdsc
>>> is collapsed at hardware during a gpu reset sequence. This is because it
>>> has a votable gdsc which could be ON due to a vote from another subsystem
>>> like tz, hyp etc or due to an internal hardware signal. To allow
>>> this, gpucc driver can expose an interface to the client driver using
>>> reset framework. Using this the client driver can trigger a polling within
>>> the gdsc driver.
>> @Akhil, this description is fairly generic. As we've reached the state
>> where the hardware has settled and we return to the client, what
>> prevents it from being powered up again?
>>
>> Or is it simply a question of it hitting the powered-off state, not
>> necessarily staying there?
> Correct. It doesn't need to stay there. The intention is to hit the powered-off state at least once to clear all the internal hw states (basically a hw reset).
>
> -Akhil.
>> Regards,
>> Bjorn
>>
>>> This series is rebased on top of qcom/linux:for-next branch.
>>>
>>> Related discussion: https://patchwork.freedesktop.org/patch/493144/
>>>
>>> Changes in v7:
>>> - Update commit message (Bjorn)
>>> - Rebased on top of qcom/linux:for-next branch.
>>>
>>> Changes in v6:
>>> - No code changes in this version. Just captured the Acked-by tags
>>>
>>> Changes in v5:
>>> - Nit: Remove a duplicate blank line (Krzysztof)
>>>
>>> Changes in v4:
>>> - Update gpu dt-binding schema
>>> - Typo fix in commit text
>>>
>>> Changes in v3:
>>> - Use pointer to const for "struct qcom_reset_ops" in qcom_reset_map (Krzysztof)
>>>
>>> Changes in v2:
>>> - Return error when a particular custom reset op is not implemented. (Dmitry)
>>>
>>> Akhil P Oommen (6):
>>>   dt-bindings: clk: qcom: Support gpu cx gdsc reset
>>>   clk: qcom: Allow custom reset ops
>>>   clk: qcom: gdsc: Add a reset op to poll gdsc collapse
>>>   clk: qcom: gpucc-sc7280: Add cx collapse reset support
>>>   dt-bindings: drm/msm/gpu: Add optional resets
>>>   arm64: dts: qcom: sc7280: Add Reset support for gpu
>>>
>>>  .../devicetree/bindings/display/msm/gpu.yaml       |  6 +++++
>>>  arch/arm64/boot/dts/qcom/sc7280.dtsi               |  3 +++
>>>  drivers/clk/qcom/gdsc.c                            | 23 ++++++++++++++----
>>>  drivers/clk/qcom/gdsc.h                            |  7 ++++++
>>>  drivers/clk/qcom/gpucc-sc7280.c                    | 10 ++++++++
>>>  drivers/clk/qcom/reset.c                           | 27 +++++++++++++++++++++-
>>>  drivers/clk/qcom/reset.h                           |  8 +++++++
>>>  include/dt-bindings/clock/qcom,gpucc-sc7280.h      |  3 +++
>>>  8 files changed, 82 insertions(+), 5 deletions(-)
>>>
>>> -- 
>>> 2.7.4
>>>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 3/6] clk: qcom: gdsc: Add a reset op to poll gdsc collapse
  2022-10-05  9:07 ` [PATCH v7 3/6] clk: qcom: gdsc: Add a reset op to poll gdsc collapse Akhil P Oommen
@ 2022-12-07 15:45   ` Ulf Hansson
  2022-12-08 15:02     ` Akhil P Oommen
  0 siblings, 1 reply; 27+ messages in thread
From: Ulf Hansson @ 2022-12-07 15:45 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Andy Gross, Konrad Dybcio,
	Michael Turquette, Stephen Boyd, linux-clk, linux-kernel

On Wed, 5 Oct 2022 at 11:08, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
>
> Add a reset op compatible function to poll for gdsc collapse. This is
> required because:
>   1. We don't wait for it to turn OFF at hardware for VOTABLE GDSCs.
>   2. There is no way for client drivers (eg. gpu driver) to do
>   put-with-wait for these gdscs which is required in some scenarios
>   (eg. GPU recovery).

What puzzles me a bit, who is the typical consumer of the reset.

I looked at patch4 and tried to figure it out, but let's discuss that
in that thread instead. Some more comments, see below.

>
> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> ---
>
> Changes in v7:
> - Update commit message (Bjorn)
>
> Changes in v2:
> - Minor update to function prototype
>
>  drivers/clk/qcom/gdsc.c | 23 +++++++++++++++++++----
>  drivers/clk/qcom/gdsc.h |  7 +++++++
>  2 files changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
> index 7cf5e13..ccef742 100644
> --- a/drivers/clk/qcom/gdsc.c
> +++ b/drivers/clk/qcom/gdsc.c
> @@ -17,6 +17,7 @@
>  #include <linux/reset-controller.h>
>  #include <linux/slab.h>
>  #include "gdsc.h"
> +#include "reset.h"
>
>  #define PWR_ON_MASK            BIT(31)
>  #define EN_REST_WAIT_MASK      GENMASK_ULL(23, 20)
> @@ -116,7 +117,8 @@ static int gdsc_hwctrl(struct gdsc *sc, bool en)
>         return regmap_update_bits(sc->regmap, sc->gdscr, HW_CONTROL_MASK, val);
>  }
>
> -static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status)
> +static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status,
> +               s64 timeout_us, unsigned int interval_ms)
>  {
>         ktime_t start;
>
> @@ -124,7 +126,9 @@ static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status)
>         do {
>                 if (gdsc_check_status(sc, status))
>                         return 0;
> -       } while (ktime_us_delta(ktime_get(), start) < TIMEOUT_US);
> +               if (interval_ms)
> +                       msleep(interval_ms);
> +       } while (ktime_us_delta(ktime_get(), start) < timeout_us);

Rather than continue to open code this polling loop, would it not make
sense to convert the code into using readx_poll_timeout() (or some of
its friends).

Down the road, this leads to that the msleep() above should become
usleep_range() instead, which seems more correct to me.

>
>         if (gdsc_check_status(sc, status))
>                 return 0;
> @@ -189,7 +193,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
>                 udelay(1);
>         }
>
> -       ret = gdsc_poll_status(sc, status);
> +       ret = gdsc_poll_status(sc, status, TIMEOUT_US, 0);
>         WARN(ret, "%s status stuck at 'o%s'", sc->pd.name, status ? "ff" : "n");
>
>         if (!ret && status == GDSC_OFF && sc->rsupply) {
> @@ -360,7 +364,7 @@ static int _gdsc_disable(struct gdsc *sc)
>                  */
>                 udelay(1);
>
> -               ret = gdsc_poll_status(sc, GDSC_ON);
> +               ret = gdsc_poll_status(sc, GDSC_ON, TIMEOUT_US, 0);
>                 if (ret)
>                         return ret;
>         }
> @@ -608,3 +612,14 @@ int gdsc_gx_do_nothing_enable(struct generic_pm_domain *domain)
>         return 0;
>  }
>  EXPORT_SYMBOL_GPL(gdsc_gx_do_nothing_enable);
> +
> +int gdsc_wait_for_collapse(void *priv)
> +{
> +       struct gdsc *sc = priv;
> +       int ret;
> +
> +       ret = gdsc_poll_status(sc, GDSC_OFF, 500000, 5);
> +       WARN(ret, "%s status stuck at 'on'", sc->pd.name);
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(gdsc_wait_for_collapse);
> diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h
> index 981a12c..5395f69 100644
> --- a/drivers/clk/qcom/gdsc.h
> +++ b/drivers/clk/qcom/gdsc.h
> @@ -12,6 +12,7 @@
>  struct regmap;
>  struct regulator;
>  struct reset_controller_dev;
> +struct qcom_reset_map;
>
>  /**
>   * struct gdsc - Globally Distributed Switch Controller
> @@ -88,6 +89,7 @@ int gdsc_register(struct gdsc_desc *desc, struct reset_controller_dev *,
>                   struct regmap *);
>  void gdsc_unregister(struct gdsc_desc *desc);
>  int gdsc_gx_do_nothing_enable(struct generic_pm_domain *domain);
> +int gdsc_wait_for_collapse(void *priv);
>  #else
>  static inline int gdsc_register(struct gdsc_desc *desc,
>                                 struct reset_controller_dev *rcdev,
> @@ -97,5 +99,10 @@ static inline int gdsc_register(struct gdsc_desc *desc,
>  }
>
>  static inline void gdsc_unregister(struct gdsc_desc *desc) {};
> +
> +static int gdsc_wait_for_collapse(void *priv)
> +{
> +       return  -ENOSYS;
> +}
>  #endif /* CONFIG_QCOM_GDSC */
>  #endif /* __QCOM_GDSC_H__ */

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support
  2022-10-05  9:07 ` [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support Akhil P Oommen
@ 2022-12-07 15:46   ` Ulf Hansson
  2022-12-08 15:24     ` Akhil P Oommen
  0 siblings, 1 reply; 27+ messages in thread
From: Ulf Hansson @ 2022-12-07 15:46 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Andy Gross, Konrad Dybcio,
	Michael Turquette, Stephen Boyd, linux-clk, linux-kernel

On Wed, 5 Oct 2022 at 11:08, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
>
> Allow a consumer driver to poll for cx gdsc collapse through Reset
> framework.

Would you mind extending this commit message, to allow us to better
understand what part is really the consumer part.

I was expecting the consumer part to be the GPU (adreno) driver, but I
may have failed to understand correctly. It would be nice to see an
example of a typical sequence, where the reset is being
asserted/deasserted, from the consumer point of view. Would you mind
explaining this a bit more?

>
> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

Kind regards
Uffe

> ---
>
> (no changes since v3)
>
> Changes in v3:
> - Convert 'struct qcom_reset_ops cx_gdsc_reset' to 'static const' (Krzysztof)
>
> Changes in v2:
> - Minor update to use the updated custom reset ops implementation
>
>  drivers/clk/qcom/gpucc-sc7280.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/drivers/clk/qcom/gpucc-sc7280.c b/drivers/clk/qcom/gpucc-sc7280.c
> index 9a832f2..fece3f4 100644
> --- a/drivers/clk/qcom/gpucc-sc7280.c
> +++ b/drivers/clk/qcom/gpucc-sc7280.c
> @@ -433,12 +433,22 @@ static const struct regmap_config gpu_cc_sc7280_regmap_config = {
>         .fast_io = true,
>  };
>
> +static const struct qcom_reset_ops cx_gdsc_reset = {
> +       .reset = gdsc_wait_for_collapse,
> +};
> +
> +static const struct qcom_reset_map gpucc_sc7280_resets[] = {
> +       [GPU_CX_COLLAPSE] = { .ops = &cx_gdsc_reset, .priv = &cx_gdsc },
> +};
> +
>  static const struct qcom_cc_desc gpu_cc_sc7280_desc = {
>         .config = &gpu_cc_sc7280_regmap_config,
>         .clks = gpu_cc_sc7280_clocks,
>         .num_clks = ARRAY_SIZE(gpu_cc_sc7280_clocks),
>         .gdscs = gpu_cc_sc7180_gdscs,
>         .num_gdscs = ARRAY_SIZE(gpu_cc_sc7180_gdscs),
> +       .resets = gpucc_sc7280_resets,
> +       .num_resets = ARRAY_SIZE(gpucc_sc7280_resets),
>  };
>
>  static const struct of_device_id gpu_cc_sc7280_match_table[] = {
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-01 22:57 ` Bjorn Andersson
  2022-12-02  7:00   ` [Freedreno] " Akhil P Oommen
@ 2022-12-07 16:00   ` Ulf Hansson
  2022-12-07 16:54     ` Bjorn Andersson
  2022-12-08 15:31     ` Akhil P Oommen
  1 sibling, 2 replies; 27+ messages in thread
From: Ulf Hansson @ 2022-12-07 16:00 UTC (permalink / raw)
  To: Bjorn Andersson, Akhil P Oommen
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
>
> On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> >
>
> @Ulf, Akhil has a power-domain for a piece of hardware which may be
> voted active by multiple different subsystems (co-processors/execution
> contexts) in the system.
>
> As such, during the powering down sequence we don't wait for the
> power-domain to turn off. But in the event of an error, the recovery
> mechanism relies on waiting for the hardware to settle in a powered off
> state.
>
> The proposal here is to use the reset framework to wait for this state
> to be reached, before continuing with the recovery mechanism in the
> client driver.

I tried to review the series (see my other replies), but I am not sure
I fully understand the consumer part.

More exactly, when and who is going to pull the reset and at what point?

>
> Given our other discussions on quirky behavior, do you have any
> input/suggestions on this?
>
> > Some clients like adreno gpu driver would like to ensure that its gdsc
> > is collapsed at hardware during a gpu reset sequence. This is because it
> > has a votable gdsc which could be ON due to a vote from another subsystem
> > like tz, hyp etc or due to an internal hardware signal. To allow
> > this, gpucc driver can expose an interface to the client driver using
> > reset framework. Using this the client driver can trigger a polling within
> > the gdsc driver.
>
> @Akhil, this description is fairly generic. As we've reached the state
> where the hardware has settled and we return to the client, what
> prevents it from being powered up again?
>
> Or is it simply a question of it hitting the powered-off state, not
> necessarily staying there?

Okay, so it's indeed the GPU driver that is going to assert/de-assert
the reset at some point. Right?

That seems like a reasonable approach to me, even if it's a bit
unclear under what conditions that could happen.

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-07 16:00   ` Ulf Hansson
@ 2022-12-07 16:54     ` Bjorn Andersson
  2022-12-08 13:40       ` Ulf Hansson
  2022-12-08 15:31     ` Akhil P Oommen
  1 sibling, 1 reply; 27+ messages in thread
From: Bjorn Andersson @ 2022-12-07 16:54 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Akhil P Oommen, freedreno, dri-devel, linux-arm-msm, Rob Clark,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
> On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
> >
> > On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> > >
> >
> > @Ulf, Akhil has a power-domain for a piece of hardware which may be
> > voted active by multiple different subsystems (co-processors/execution
> > contexts) in the system.
> >
> > As such, during the powering down sequence we don't wait for the
> > power-domain to turn off. But in the event of an error, the recovery
> > mechanism relies on waiting for the hardware to settle in a powered off
> > state.
> >
> > The proposal here is to use the reset framework to wait for this state
> > to be reached, before continuing with the recovery mechanism in the
> > client driver.
> 
> I tried to review the series (see my other replies), but I am not sure
> I fully understand the consumer part.
> 
> More exactly, when and who is going to pull the reset and at what point?
> 
> >
> > Given our other discussions on quirky behavior, do you have any
> > input/suggestions on this?
> >
> > > Some clients like adreno gpu driver would like to ensure that its gdsc
> > > is collapsed at hardware during a gpu reset sequence. This is because it
> > > has a votable gdsc which could be ON due to a vote from another subsystem
> > > like tz, hyp etc or due to an internal hardware signal. To allow
> > > this, gpucc driver can expose an interface to the client driver using
> > > reset framework. Using this the client driver can trigger a polling within
> > > the gdsc driver.
> >
> > @Akhil, this description is fairly generic. As we've reached the state
> > where the hardware has settled and we return to the client, what
> > prevents it from being powered up again?
> >
> > Or is it simply a question of it hitting the powered-off state, not
> > necessarily staying there?
> 
> Okay, so it's indeed the GPU driver that is going to assert/de-assert
> the reset at some point. Right?
> 
> That seems like a reasonable approach to me, even if it's a bit
> unclear under what conditions that could happen.
> 

Generally the disable-path of the power-domain does not check that the
power-domain is actually turned off, because the status might indicate
that the hardware is voting for the power-domain to be on.

As part of the recovery of the GPU after some fatal fault, the GPU
driver does something which will cause the hardware votes for the
power-domain to be let go, and then the driver does pm_runtime_put().

But in this case the GPU driver wants to ensure that the power-domain is
actually powered down, before it does pm_runtime_get() again. To ensure
that the hardware lost its state...

The proposal here is to use a reset to reach into the power-domain
provider and wait for the hardware to be turned off, before the GPU
driver attempts turning the power-domain on again.


In other words, there is no reset. This is a hack to make a normally
asynchronous pd.power_off() to be synchronous in this particular case.

Regards,
Bjorn

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-07 16:54     ` Bjorn Andersson
@ 2022-12-08 13:40       ` Ulf Hansson
  2022-12-08 14:45         ` Akhil P Oommen
  2022-12-08 21:06         ` Bjorn Andersson
  0 siblings, 2 replies; 27+ messages in thread
From: Ulf Hansson @ 2022-12-08 13:40 UTC (permalink / raw)
  To: Akhil P Oommen, Bjorn Andersson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson <andersson@kernel.org> wrote:
>
> On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
> > On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
> > >
> > > On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> > > >
> > >
> > > @Ulf, Akhil has a power-domain for a piece of hardware which may be
> > > voted active by multiple different subsystems (co-processors/execution
> > > contexts) in the system.
> > >
> > > As such, during the powering down sequence we don't wait for the
> > > power-domain to turn off. But in the event of an error, the recovery
> > > mechanism relies on waiting for the hardware to settle in a powered off
> > > state.
> > >
> > > The proposal here is to use the reset framework to wait for this state
> > > to be reached, before continuing with the recovery mechanism in the
> > > client driver.
> >
> > I tried to review the series (see my other replies), but I am not sure
> > I fully understand the consumer part.
> >
> > More exactly, when and who is going to pull the reset and at what point?
> >
> > >
> > > Given our other discussions on quirky behavior, do you have any
> > > input/suggestions on this?
> > >
> > > > Some clients like adreno gpu driver would like to ensure that its gdsc
> > > > is collapsed at hardware during a gpu reset sequence. This is because it
> > > > has a votable gdsc which could be ON due to a vote from another subsystem
> > > > like tz, hyp etc or due to an internal hardware signal. To allow
> > > > this, gpucc driver can expose an interface to the client driver using
> > > > reset framework. Using this the client driver can trigger a polling within
> > > > the gdsc driver.
> > >
> > > @Akhil, this description is fairly generic. As we've reached the state
> > > where the hardware has settled and we return to the client, what
> > > prevents it from being powered up again?
> > >
> > > Or is it simply a question of it hitting the powered-off state, not
> > > necessarily staying there?
> >
> > Okay, so it's indeed the GPU driver that is going to assert/de-assert
> > the reset at some point. Right?
> >
> > That seems like a reasonable approach to me, even if it's a bit
> > unclear under what conditions that could happen.
> >
>
> Generally the disable-path of the power-domain does not check that the
> power-domain is actually turned off, because the status might indicate
> that the hardware is voting for the power-domain to be on.

Is there a good reason why the HW needs to vote too, when the GPU
driver is already in control?

Or perhaps that depends on the running use case?

>
> As part of the recovery of the GPU after some fatal fault, the GPU
> driver does something which will cause the hardware votes for the
> power-domain to be let go, and then the driver does pm_runtime_put().

Okay. That "something", sounds like a device specific setting for the
corresponding gdsc, right?

So somehow the GPU driver needs to manage that setting, right?

>
> But in this case the GPU driver wants to ensure that the power-domain is
> actually powered down, before it does pm_runtime_get() again. To ensure
> that the hardware lost its state...

I see.

>
> The proposal here is to use a reset to reach into the power-domain
> provider and wait for the hardware to be turned off, before the GPU
> driver attempts turning the power-domain on again.
>
>
> In other words, there is no reset. This is a hack to make a normally
> asynchronous pd.power_off() to be synchronous in this particular case.

Alright, assuming I understood your clarifications above correctly
(thanks!), I think I have got a much better picture now.

Rather than abusing the reset interface, I think we should manage this
through the genpd's power on/off notifiers (GENPD_NOTIFY_OFF). The GPU
driver should register its corresponding device for them
(dev_pm_genpd_add_notifier()).

The trick however, is to make the behaviour of the power-domain for
the gdsc (the genpd->power_off() callback) conditional on whether the
HW is configured to vote or not. If the HW can vote, it should not
poll for the state - and vice versa when the HW can't vote.

Would this work?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-08 13:40       ` Ulf Hansson
@ 2022-12-08 14:45         ` Akhil P Oommen
  2022-12-08 21:06         ` Bjorn Andersson
  1 sibling, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-08 14:45 UTC (permalink / raw)
  To: Ulf Hansson, Bjorn Andersson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On 12/8/2022 7:10 PM, Ulf Hansson wrote:
> On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson <andersson@kernel.org> wrote:
>> On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
>>> On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
>>>> On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
>>>> @Ulf, Akhil has a power-domain for a piece of hardware which may be
>>>> voted active by multiple different subsystems (co-processors/execution
>>>> contexts) in the system.
>>>>
>>>> As such, during the powering down sequence we don't wait for the
>>>> power-domain to turn off. But in the event of an error, the recovery
>>>> mechanism relies on waiting for the hardware to settle in a powered off
>>>> state.
>>>>
>>>> The proposal here is to use the reset framework to wait for this state
>>>> to be reached, before continuing with the recovery mechanism in the
>>>> client driver.
>>> I tried to review the series (see my other replies), but I am not sure
>>> I fully understand the consumer part.
>>>
>>> More exactly, when and who is going to pull the reset and at what point?
>>>
>>>> Given our other discussions on quirky behavior, do you have any
>>>> input/suggestions on this?
>>>>
>>>>> Some clients like adreno gpu driver would like to ensure that its gdsc
>>>>> is collapsed at hardware during a gpu reset sequence. This is because it
>>>>> has a votable gdsc which could be ON due to a vote from another subsystem
>>>>> like tz, hyp etc or due to an internal hardware signal. To allow
>>>>> this, gpucc driver can expose an interface to the client driver using
>>>>> reset framework. Using this the client driver can trigger a polling within
>>>>> the gdsc driver.
>>>> @Akhil, this description is fairly generic. As we've reached the state
>>>> where the hardware has settled and we return to the client, what
>>>> prevents it from being powered up again?
>>>>
>>>> Or is it simply a question of it hitting the powered-off state, not
>>>> necessarily staying there?
>>> Okay, so it's indeed the GPU driver that is going to assert/de-assert
>>> the reset at some point. Right?
>>>
>>> That seems like a reasonable approach to me, even if it's a bit
>>> unclear under what conditions that could happen.
>>>
>> Generally the disable-path of the power-domain does not check that the
>> power-domain is actually turned off, because the status might indicate
>> that the hardware is voting for the power-domain to be on.
> Is there a good reason why the HW needs to vote too, when the GPU
> driver is already in control?
>
> Or perhaps that depends on the running use case?
This power domain can be voted to be ON from other subsystems outside of linux kernel like secure os, hypervisor etc through separate vote registers. So it is not completely under the control of linux clk driver. Linux clk driver can only vote it to be kept ON, check current status etc, but cannot force it to be OFF. I believe this is why it is a votable gdsc in linux-clk driver.

Just a general clarification. GPU has mainly 2 power domains: (1) CX which is shared by GPU and its SMMU, (2) GX which is GPU specific and managed mostly by a power management core within GPU. This patch series is to allow gpu driver to ensure that CX gdsc has collapsed which in turn will reset GPU's internal state.
>
>> As part of the recovery of the GPU after some fatal fault, the GPU
>> driver does something which will cause the hardware votes for the
>> power-domain to be let go, and then the driver does pm_runtime_put().
> Okay. That "something", sounds like a device specific setting for the
> corresponding gdsc, right?
>
> So somehow the GPU driver needs to manage that setting, right?
Clarified about this above.
>
>> But in this case the GPU driver wants to ensure that the power-domain is
>> actually powered down, before it does pm_runtime_get() again. To ensure
>> that the hardware lost its state...
> I see.
>
>> The proposal here is to use a reset to reach into the power-domain
>> provider and wait for the hardware to be turned off, before the GPU
>> driver attempts turning the power-domain on again.
>>
>>
>> In other words, there is no reset. This is a hack to make a normally
>> asynchronous pd.power_off() to be synchronous in this particular case.
Not really. Because other non-linux subsystems are involved here for CX gdsc, we need a way to poll the gdsc register to ensure that it has indeed collapsed before gpu driver continue with re-initialization of gpu. It is either this approach using 'reset' framework or plumbing a new path from gpu driver to gpucc-gdsc driver to poll the collapse status. I went with the 'reset' approach as per the consensus here: https://patchwork.freedesktop.org/patch/493143/

-Akhil.
> Alright, assuming I understood your clarifications above correctly
> (thanks!), I think I have got a much better picture now.
>
> Rather than abusing the reset interface, I think we should manage this
> through the genpd's power on/off notifiers (GENPD_NOTIFY_OFF). The GPU
> driver should register its corresponding device for them
> (dev_pm_genpd_add_notifier()).
>
> The trick however, is to make the behaviour of the power-domain for
> the gdsc (the genpd->power_off() callback) conditional on whether the
> HW is configured to vote or not. If the HW can vote, it should not
> poll for the state - and vice versa when the HW can't vote.
>
> Would this work?
>
> Kind regards
> Uffe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 3/6] clk: qcom: gdsc: Add a reset op to poll gdsc collapse
  2022-12-07 15:45   ` Ulf Hansson
@ 2022-12-08 15:02     ` Akhil P Oommen
  2022-12-08 15:30       ` [Freedreno] " Akhil P Oommen
  0 siblings, 1 reply; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-08 15:02 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Andy Gross, Konrad Dybcio,
	Michael Turquette, Stephen Boyd, linux-clk, linux-kernel

On 12/7/2022 9:15 PM, Ulf Hansson wrote:
> On Wed, 5 Oct 2022 at 11:08, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
>> Add a reset op compatible function to poll for gdsc collapse. This is
>> required because:
>>   1. We don't wait for it to turn OFF at hardware for VOTABLE GDSCs.
>>   2. There is no way for client drivers (eg. gpu driver) to do
>>   put-with-wait for these gdscs which is required in some scenarios
>>   (eg. GPU recovery).
> What puzzles me a bit, who is the typical consumer of the reset.
>
> I looked at patch4 and tried to figure it out, but let's discuss that
> in that thread instead. Some more comments, see below.

https://patchwork.freedesktop.org/patch/498397/
gpu driver side changes are already merged in upstream. We call this interface during a GPU recovery which is supposed to be a rare event.
>
>> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> ---
>>
>> Changes in v7:
>> - Update commit message (Bjorn)
>>
>> Changes in v2:
>> - Minor update to function prototype
>>
>>  drivers/clk/qcom/gdsc.c | 23 +++++++++++++++++++----
>>  drivers/clk/qcom/gdsc.h |  7 +++++++
>>  2 files changed, 26 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
>> index 7cf5e13..ccef742 100644
>> --- a/drivers/clk/qcom/gdsc.c
>> +++ b/drivers/clk/qcom/gdsc.c
>> @@ -17,6 +17,7 @@
>>  #include <linux/reset-controller.h>
>>  #include <linux/slab.h>
>>  #include "gdsc.h"
>> +#include "reset.h"
>>
>>  #define PWR_ON_MASK            BIT(31)
>>  #define EN_REST_WAIT_MASK      GENMASK_ULL(23, 20)
>> @@ -116,7 +117,8 @@ static int gdsc_hwctrl(struct gdsc *sc, bool en)
>>         return regmap_update_bits(sc->regmap, sc->gdscr, HW_CONTROL_MASK, val);
>>  }
>>
>> -static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status)
>> +static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status,
>> +               s64 timeout_us, unsigned int interval_ms)
>>  {
>>         ktime_t start;
>>
>> @@ -124,7 +126,9 @@ static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status)
>>         do {
>>                 if (gdsc_check_status(sc, status))
>>                         return 0;
>> -       } while (ktime_us_delta(ktime_get(), start) < TIMEOUT_US);
>> +               if (interval_ms)
>> +                       msleep(interval_ms);
>> +       } while (ktime_us_delta(ktime_get(), start) < timeout_us);
> Rather than continue to open code this polling loop, would it not make
> sense to convert the code into using readx_poll_timeout() (or some of
> its friends).
I was going for a minimal code churn to get this mainlined easily. I like the idea, perhaps we can have a refactor patch if there is consensus.

-Akhil.
>
> Down the road, this leads to that the msleep() above should become
> usleep_range() instead, which seems more correct to me.
>
>>         if (gdsc_check_status(sc, status))
>>                 return 0;
>> @@ -189,7 +193,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
>>                 udelay(1);
>>         }
>>
>> -       ret = gdsc_poll_status(sc, status);
>> +       ret = gdsc_poll_status(sc, status, TIMEOUT_US, 0);
>>         WARN(ret, "%s status stuck at 'o%s'", sc->pd.name, status ? "ff" : "n");
>>
>>         if (!ret && status == GDSC_OFF && sc->rsupply) {
>> @@ -360,7 +364,7 @@ static int _gdsc_disable(struct gdsc *sc)
>>                  */
>>                 udelay(1);
>>
>> -               ret = gdsc_poll_status(sc, GDSC_ON);
>> +               ret = gdsc_poll_status(sc, GDSC_ON, TIMEOUT_US, 0);
>>                 if (ret)
>>                         return ret;
>>         }
>> @@ -608,3 +612,14 @@ int gdsc_gx_do_nothing_enable(struct generic_pm_domain *domain)
>>         return 0;
>>  }
>>  EXPORT_SYMBOL_GPL(gdsc_gx_do_nothing_enable);
>> +
>> +int gdsc_wait_for_collapse(void *priv)
>> +{
>> +       struct gdsc *sc = priv;
>> +       int ret;
>> +
>> +       ret = gdsc_poll_status(sc, GDSC_OFF, 500000, 5);
>> +       WARN(ret, "%s status stuck at 'on'", sc->pd.name);
>> +       return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gdsc_wait_for_collapse);
>> diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h
>> index 981a12c..5395f69 100644
>> --- a/drivers/clk/qcom/gdsc.h
>> +++ b/drivers/clk/qcom/gdsc.h
>> @@ -12,6 +12,7 @@
>>  struct regmap;
>>  struct regulator;
>>  struct reset_controller_dev;
>> +struct qcom_reset_map;
>>
>>  /**
>>   * struct gdsc - Globally Distributed Switch Controller
>> @@ -88,6 +89,7 @@ int gdsc_register(struct gdsc_desc *desc, struct reset_controller_dev *,
>>                   struct regmap *);
>>  void gdsc_unregister(struct gdsc_desc *desc);
>>  int gdsc_gx_do_nothing_enable(struct generic_pm_domain *domain);
>> +int gdsc_wait_for_collapse(void *priv);
>>  #else
>>  static inline int gdsc_register(struct gdsc_desc *desc,
>>                                 struct reset_controller_dev *rcdev,
>> @@ -97,5 +99,10 @@ static inline int gdsc_register(struct gdsc_desc *desc,
>>  }
>>
>>  static inline void gdsc_unregister(struct gdsc_desc *desc) {};
>> +
>> +static int gdsc_wait_for_collapse(void *priv)
>> +{
>> +       return  -ENOSYS;
>> +}
>>  #endif /* CONFIG_QCOM_GDSC */
>>  #endif /* __QCOM_GDSC_H__ */
> Kind regards
> Uffe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support
  2022-12-07 15:46   ` Ulf Hansson
@ 2022-12-08 15:24     ` Akhil P Oommen
  2022-12-08 21:09       ` Bjorn Andersson
  0 siblings, 1 reply; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-08 15:24 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Bjorn Andersson,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Andy Gross, Konrad Dybcio,
	Michael Turquette, Stephen Boyd, linux-clk, linux-kernel

On 12/7/2022 9:16 PM, Ulf Hansson wrote:
> On Wed, 5 Oct 2022 at 11:08, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
>> Allow a consumer driver to poll for cx gdsc collapse through Reset
>> framework.
> Would you mind extending this commit message, to allow us to better
> understand what part is really the consumer part.
Sure. I can do that.
>
> I was expecting the consumer part to be the GPU (adreno) driver, but I
> may have failed to understand correctly. It would be nice to see an
> example of a typical sequence, where the reset is being
> asserted/deasserted, from the consumer point of view. Would you mind
> explaining this a bit more?
https://elixir.bootlin.com/linux/v6.1-rc8/source/drivers/gpu/drm/msm/adreno/a6xx_gpu.c#L1309
You are correct. The consumer is adreno gpu driver. When there is a gpu fault, these sequences are followed:
1. drop pm_runtime_put() for gpu device which will drops its vote on 'cx' genpd. line: 1306
2. At this point, there could be vote from either smmu driver (smmu is under same power domain too) or from other subsystems (tz/hyp).
3. So we call into gdsc driver through reset interface to poll the gdsc register to ensure it collapsed at least once. Line: 1309
4. Then we turn ON gpu. line:1314.

-Akhil.
>
>> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Kind regards
> Uffe
>
>> ---
>>
>> (no changes since v3)
>>
>> Changes in v3:
>> - Convert 'struct qcom_reset_ops cx_gdsc_reset' to 'static const' (Krzysztof)
>>
>> Changes in v2:
>> - Minor update to use the updated custom reset ops implementation
>>
>>  drivers/clk/qcom/gpucc-sc7280.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/clk/qcom/gpucc-sc7280.c b/drivers/clk/qcom/gpucc-sc7280.c
>> index 9a832f2..fece3f4 100644
>> --- a/drivers/clk/qcom/gpucc-sc7280.c
>> +++ b/drivers/clk/qcom/gpucc-sc7280.c
>> @@ -433,12 +433,22 @@ static const struct regmap_config gpu_cc_sc7280_regmap_config = {
>>         .fast_io = true,
>>  };
>>
>> +static const struct qcom_reset_ops cx_gdsc_reset = {
>> +       .reset = gdsc_wait_for_collapse,
>> +};
>> +
>> +static const struct qcom_reset_map gpucc_sc7280_resets[] = {
>> +       [GPU_CX_COLLAPSE] = { .ops = &cx_gdsc_reset, .priv = &cx_gdsc },
>> +};
>> +
>>  static const struct qcom_cc_desc gpu_cc_sc7280_desc = {
>>         .config = &gpu_cc_sc7280_regmap_config,
>>         .clks = gpu_cc_sc7280_clocks,
>>         .num_clks = ARRAY_SIZE(gpu_cc_sc7280_clocks),
>>         .gdscs = gpu_cc_sc7180_gdscs,
>>         .num_gdscs = ARRAY_SIZE(gpu_cc_sc7180_gdscs),
>> +       .resets = gpucc_sc7280_resets,
>> +       .num_resets = ARRAY_SIZE(gpucc_sc7280_resets),
>>  };
>>
>>  static const struct of_device_id gpu_cc_sc7280_match_table[] = {
>> --
>> 2.7.4
>>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Freedreno] [PATCH v7 3/6] clk: qcom: gdsc: Add a reset op to poll gdsc collapse
  2022-12-08 15:02     ` Akhil P Oommen
@ 2022-12-08 15:30       ` Akhil P Oommen
  0 siblings, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-08 15:30 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-clk, Stephen Boyd, linux-arm-msm, Bjorn Andersson,
	Konrad Dybcio, Douglas Anderson, dri-devel, Stephen Boyd,
	krzysztof.kozlowski, Rob Clark, Andy Gross, Philipp Zabel,
	Dmitry Baryshkov, freedreno, Michael Turquette, linux-kernel

On 12/8/2022 8:32 PM, Akhil P Oommen wrote:
> On 12/7/2022 9:15 PM, Ulf Hansson wrote:
>> On Wed, 5 Oct 2022 at 11:08, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
>>> Add a reset op compatible function to poll for gdsc collapse. This is
>>> required because:
>>>   1. We don't wait for it to turn OFF at hardware for VOTABLE GDSCs.
>>>   2. There is no way for client drivers (eg. gpu driver) to do
>>>   put-with-wait for these gdscs which is required in some scenarios
>>>   (eg. GPU recovery).
>> What puzzles me a bit, who is the typical consumer of the reset.
>>
>> I looked at patch4 and tried to figure it out, but let's discuss that
>> in that thread instead. Some more comments, see below.
> https://patchwork.freedesktop.org/patch/498397/
> gpu driver side changes are already merged in upstream. We call this interface during a GPU recovery which is supposed to be a rare event.
>>> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
>>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>> ---
>>>
>>> Changes in v7:
>>> - Update commit message (Bjorn)
>>>
>>> Changes in v2:
>>> - Minor update to function prototype
>>>
>>>  drivers/clk/qcom/gdsc.c | 23 +++++++++++++++++++----
>>>  drivers/clk/qcom/gdsc.h |  7 +++++++
>>>  2 files changed, 26 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
>>> index 7cf5e13..ccef742 100644
>>> --- a/drivers/clk/qcom/gdsc.c
>>> +++ b/drivers/clk/qcom/gdsc.c
>>> @@ -17,6 +17,7 @@
>>>  #include <linux/reset-controller.h>
>>>  #include <linux/slab.h>
>>>  #include "gdsc.h"
>>> +#include "reset.h"
>>>
>>>  #define PWR_ON_MASK            BIT(31)
>>>  #define EN_REST_WAIT_MASK      GENMASK_ULL(23, 20)
>>> @@ -116,7 +117,8 @@ static int gdsc_hwctrl(struct gdsc *sc, bool en)
>>>         return regmap_update_bits(sc->regmap, sc->gdscr, HW_CONTROL_MASK, val);
>>>  }
>>>
>>> -static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status)
>>> +static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status,
>>> +               s64 timeout_us, unsigned int interval_ms)
>>>  {
>>>         ktime_t start;
>>>
>>> @@ -124,7 +126,9 @@ static int gdsc_poll_status(struct gdsc *sc, enum gdsc_status status)
>>>         do {
>>>                 if (gdsc_check_status(sc, status))
>>>                         return 0;
>>> -       } while (ktime_us_delta(ktime_get(), start) < TIMEOUT_US);
>>> +               if (interval_ms)
>>> +                       msleep(interval_ms);
>>> +       } while (ktime_us_delta(ktime_get(), start) < timeout_us);
>> Rather than continue to open code this polling loop, would it not make
>> sense to convert the code into using readx_poll_timeout() (or some of
>> its friends).
> I was going for a minimal code churn to get this mainlined easily. I like the idea, perhaps we can have a refactor patch if there is consensus.
>
> -Akhil.
>> Down the road, this leads to that the msleep() above should become
>> usleep_range() instead, which seems more correct to me.
Btw, we can optimize this a bit more using 'completion' to minimize CPU overhead. But that can wait as this is supposed to be a rarely used path. We don't expect gpu faults often. I would like to see a simpler implementation upstream first.

-Akhil.
>>
>>>         if (gdsc_check_status(sc, status))
>>>                 return 0;
>>> @@ -189,7 +193,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, enum gdsc_status status)
>>>                 udelay(1);
>>>         }
>>>
>>> -       ret = gdsc_poll_status(sc, status);
>>> +       ret = gdsc_poll_status(sc, status, TIMEOUT_US, 0);
>>>         WARN(ret, "%s status stuck at 'o%s'", sc->pd.name, status ? "ff" : "n");
>>>
>>>         if (!ret && status == GDSC_OFF && sc->rsupply) {
>>> @@ -360,7 +364,7 @@ static int _gdsc_disable(struct gdsc *sc)
>>>                  */
>>>                 udelay(1);
>>>
>>> -               ret = gdsc_poll_status(sc, GDSC_ON);
>>> +               ret = gdsc_poll_status(sc, GDSC_ON, TIMEOUT_US, 0);
>>>                 if (ret)
>>>                         return ret;
>>>         }
>>> @@ -608,3 +612,14 @@ int gdsc_gx_do_nothing_enable(struct generic_pm_domain *domain)
>>>         return 0;
>>>  }
>>>  EXPORT_SYMBOL_GPL(gdsc_gx_do_nothing_enable);
>>> +
>>> +int gdsc_wait_for_collapse(void *priv)
>>> +{
>>> +       struct gdsc *sc = priv;
>>> +       int ret;
>>> +
>>> +       ret = gdsc_poll_status(sc, GDSC_OFF, 500000, 5);
>>> +       WARN(ret, "%s status stuck at 'on'", sc->pd.name);
>>> +       return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(gdsc_wait_for_collapse);
>>> diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h
>>> index 981a12c..5395f69 100644
>>> --- a/drivers/clk/qcom/gdsc.h
>>> +++ b/drivers/clk/qcom/gdsc.h
>>> @@ -12,6 +12,7 @@
>>>  struct regmap;
>>>  struct regulator;
>>>  struct reset_controller_dev;
>>> +struct qcom_reset_map;
>>>
>>>  /**
>>>   * struct gdsc - Globally Distributed Switch Controller
>>> @@ -88,6 +89,7 @@ int gdsc_register(struct gdsc_desc *desc, struct reset_controller_dev *,
>>>                   struct regmap *);
>>>  void gdsc_unregister(struct gdsc_desc *desc);
>>>  int gdsc_gx_do_nothing_enable(struct generic_pm_domain *domain);
>>> +int gdsc_wait_for_collapse(void *priv);
>>>  #else
>>>  static inline int gdsc_register(struct gdsc_desc *desc,
>>>                                 struct reset_controller_dev *rcdev,
>>> @@ -97,5 +99,10 @@ static inline int gdsc_register(struct gdsc_desc *desc,
>>>  }
>>>
>>>  static inline void gdsc_unregister(struct gdsc_desc *desc) {};
>>> +
>>> +static int gdsc_wait_for_collapse(void *priv)
>>> +{
>>> +       return  -ENOSYS;
>>> +}
>>>  #endif /* CONFIG_QCOM_GDSC */
>>>  #endif /* __QCOM_GDSC_H__ */
>> Kind regards
>> Uffe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-07 16:00   ` Ulf Hansson
  2022-12-07 16:54     ` Bjorn Andersson
@ 2022-12-08 15:31     ` Akhil P Oommen
  1 sibling, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-08 15:31 UTC (permalink / raw)
  To: Ulf Hansson, Bjorn Andersson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On 12/7/2022 9:30 PM, Ulf Hansson wrote:
> On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
>> On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
>> @Ulf, Akhil has a power-domain for a piece of hardware which may be
>> voted active by multiple different subsystems (co-processors/execution
>> contexts) in the system.
>>
>> As such, during the powering down sequence we don't wait for the
>> power-domain to turn off. But in the event of an error, the recovery
>> mechanism relies on waiting for the hardware to settle in a powered off
>> state.
>>
>> The proposal here is to use the reset framework to wait for this state
>> to be reached, before continuing with the recovery mechanism in the
>> client driver.
> I tried to review the series (see my other replies), but I am not sure
> I fully understand the consumer part.
>
> More exactly, when and who is going to pull the reset and at what point?
Explained in the other patch.

-Akhil.
>
>> Given our other discussions on quirky behavior, do you have any
>> input/suggestions on this?
>>
>>> Some clients like adreno gpu driver would like to ensure that its gdsc
>>> is collapsed at hardware during a gpu reset sequence. This is because it
>>> has a votable gdsc which could be ON due to a vote from another subsystem
>>> like tz, hyp etc or due to an internal hardware signal. To allow
>>> this, gpucc driver can expose an interface to the client driver using
>>> reset framework. Using this the client driver can trigger a polling within
>>> the gdsc driver.
>> @Akhil, this description is fairly generic. As we've reached the state
>> where the hardware has settled and we return to the client, what
>> prevents it from being powered up again?
>>
>> Or is it simply a question of it hitting the powered-off state, not
>> necessarily staying there?
> Okay, so it's indeed the GPU driver that is going to assert/de-assert
> the reset at some point. Right?
>
> That seems like a reasonable approach to me, even if it's a bit
> unclear under what conditions that could happen.
>
> [...]
>
> Kind regards
> Uffe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-08 13:40       ` Ulf Hansson
  2022-12-08 14:45         ` Akhil P Oommen
@ 2022-12-08 21:06         ` Bjorn Andersson
  2022-12-09 17:36           ` Ulf Hansson
  1 sibling, 1 reply; 27+ messages in thread
From: Bjorn Andersson @ 2022-12-08 21:06 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Akhil P Oommen, freedreno, dri-devel, linux-arm-msm, Rob Clark,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On Thu, Dec 08, 2022 at 02:40:55PM +0100, Ulf Hansson wrote:
> On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson <andersson@kernel.org> wrote:
> >
> > On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
> > > On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
> > > >
> > > > On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> > > > >
> > > >
> > > > @Ulf, Akhil has a power-domain for a piece of hardware which may be
> > > > voted active by multiple different subsystems (co-processors/execution
> > > > contexts) in the system.
> > > >
> > > > As such, during the powering down sequence we don't wait for the
> > > > power-domain to turn off. But in the event of an error, the recovery
> > > > mechanism relies on waiting for the hardware to settle in a powered off
> > > > state.
> > > >
> > > > The proposal here is to use the reset framework to wait for this state
> > > > to be reached, before continuing with the recovery mechanism in the
> > > > client driver.
> > >
> > > I tried to review the series (see my other replies), but I am not sure
> > > I fully understand the consumer part.
> > >
> > > More exactly, when and who is going to pull the reset and at what point?
> > >
> > > >
> > > > Given our other discussions on quirky behavior, do you have any
> > > > input/suggestions on this?
> > > >
> > > > > Some clients like adreno gpu driver would like to ensure that its gdsc
> > > > > is collapsed at hardware during a gpu reset sequence. This is because it
> > > > > has a votable gdsc which could be ON due to a vote from another subsystem
> > > > > like tz, hyp etc or due to an internal hardware signal. To allow
> > > > > this, gpucc driver can expose an interface to the client driver using
> > > > > reset framework. Using this the client driver can trigger a polling within
> > > > > the gdsc driver.
> > > >
> > > > @Akhil, this description is fairly generic. As we've reached the state
> > > > where the hardware has settled and we return to the client, what
> > > > prevents it from being powered up again?
> > > >
> > > > Or is it simply a question of it hitting the powered-off state, not
> > > > necessarily staying there?
> > >
> > > Okay, so it's indeed the GPU driver that is going to assert/de-assert
> > > the reset at some point. Right?
> > >
> > > That seems like a reasonable approach to me, even if it's a bit
> > > unclear under what conditions that could happen.
> > >
> >
> > Generally the disable-path of the power-domain does not check that the
> > power-domain is actually turned off, because the status might indicate
> > that the hardware is voting for the power-domain to be on.
> 
> Is there a good reason why the HW needs to vote too, when the GPU
> driver is already in control?
> 
> Or perhaps that depends on the running use case?
> 
> >
> > As part of the recovery of the GPU after some fatal fault, the GPU
> > driver does something which will cause the hardware votes for the
> > power-domain to be let go, and then the driver does pm_runtime_put().
> 
> Okay. That "something", sounds like a device specific setting for the
> corresponding gdsc, right?
> 
> So somehow the GPU driver needs to manage that setting, right?
> 
> >
> > But in this case the GPU driver wants to ensure that the power-domain is
> > actually powered down, before it does pm_runtime_get() again. To ensure
> > that the hardware lost its state...
> 
> I see.
> 
> >
> > The proposal here is to use a reset to reach into the power-domain
> > provider and wait for the hardware to be turned off, before the GPU
> > driver attempts turning the power-domain on again.
> >
> >
> > In other words, there is no reset. This is a hack to make a normally
> > asynchronous pd.power_off() to be synchronous in this particular case.
> 
> Alright, assuming I understood your clarifications above correctly
> (thanks!), I think I have got a much better picture now.
> 
> Rather than abusing the reset interface, I think we should manage this
> through the genpd's power on/off notifiers (GENPD_NOTIFY_OFF). The GPU
> driver should register its corresponding device for them
> (dev_pm_genpd_add_notifier()).
> 
> The trick however, is to make the behaviour of the power-domain for
> the gdsc (the genpd->power_off() callback) conditional on whether the
> HW is configured to vote or not. If the HW can vote, it should not
> poll for the state - and vice versa when the HW can't vote.
> 

Per Akhil's description I misunderstood who the other voters are; but
either way it's not the same "HW configured" mechanism as the one we're
already discussing.


But if we based on similar means could control if the power_off() ops
should be blocking, waiting for the status indication to show that the
hardware is indeed powered down, I think this would meet the needs.

And GENPD_NOTIFY_OFF seems to provide the notification that it was
successful (i.e. happened within the timeout etc).

> Would this work?
> 

If we can control the behavior of the genpd, I think it would.

Thanks,
Bjorn

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support
  2022-12-08 15:24     ` Akhil P Oommen
@ 2022-12-08 21:09       ` Bjorn Andersson
  2022-12-12 17:49         ` Akhil P Oommen
  0 siblings, 1 reply; 27+ messages in thread
From: Bjorn Andersson @ 2022-12-08 21:09 UTC (permalink / raw)
  To: Akhil P Oommen
  Cc: Ulf Hansson, freedreno, dri-devel, linux-arm-msm, Rob Clark,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Andy Gross, Konrad Dybcio,
	Michael Turquette, Stephen Boyd, linux-clk, linux-kernel

On Thu, Dec 08, 2022 at 08:54:39PM +0530, Akhil P Oommen wrote:
> On 12/7/2022 9:16 PM, Ulf Hansson wrote:
> > On Wed, 5 Oct 2022 at 11:08, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
> >> Allow a consumer driver to poll for cx gdsc collapse through Reset
> >> framework.
> > Would you mind extending this commit message, to allow us to better
> > understand what part is really the consumer part.
> Sure. I can do that.
> >
> > I was expecting the consumer part to be the GPU (adreno) driver, but I
> > may have failed to understand correctly. It would be nice to see an
> > example of a typical sequence, where the reset is being
> > asserted/deasserted, from the consumer point of view. Would you mind
> > explaining this a bit more?
> https://elixir.bootlin.com/linux/v6.1-rc8/source/drivers/gpu/drm/msm/adreno/a6xx_gpu.c#L1309
> You are correct. The consumer is adreno gpu driver. When there is a gpu fault, these sequences are followed:
> 1. drop pm_runtime_put() for gpu device which will drops its vote on 'cx' genpd. line: 1306
> 2. At this point, there could be vote from either smmu driver (smmu is under same power domain too) or from other subsystems (tz/hyp).

Can you confirm that this is happening completely independent of what
the kernel does?

> 3. So we call into gdsc driver through reset interface to poll the gdsc register to ensure it collapsed at least once. Line: 1309

I other words, if we engineered 1. such that it would wait in
gdsc_disable() until the condition for 3. is reached, that should work
for you? (Obviously depending on the ability for us to engineer this...)

Regards,
Bjorn

> 4. Then we turn ON gpu. line:1314.
> 
> -Akhil.
> >
> >> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
> >> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > Kind regards
> > Uffe
> >
> >> ---
> >>
> >> (no changes since v3)
> >>
> >> Changes in v3:
> >> - Convert 'struct qcom_reset_ops cx_gdsc_reset' to 'static const' (Krzysztof)
> >>
> >> Changes in v2:
> >> - Minor update to use the updated custom reset ops implementation
> >>
> >>  drivers/clk/qcom/gpucc-sc7280.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/drivers/clk/qcom/gpucc-sc7280.c b/drivers/clk/qcom/gpucc-sc7280.c
> >> index 9a832f2..fece3f4 100644
> >> --- a/drivers/clk/qcom/gpucc-sc7280.c
> >> +++ b/drivers/clk/qcom/gpucc-sc7280.c
> >> @@ -433,12 +433,22 @@ static const struct regmap_config gpu_cc_sc7280_regmap_config = {
> >>         .fast_io = true,
> >>  };
> >>
> >> +static const struct qcom_reset_ops cx_gdsc_reset = {
> >> +       .reset = gdsc_wait_for_collapse,
> >> +};
> >> +
> >> +static const struct qcom_reset_map gpucc_sc7280_resets[] = {
> >> +       [GPU_CX_COLLAPSE] = { .ops = &cx_gdsc_reset, .priv = &cx_gdsc },
> >> +};
> >> +
> >>  static const struct qcom_cc_desc gpu_cc_sc7280_desc = {
> >>         .config = &gpu_cc_sc7280_regmap_config,
> >>         .clks = gpu_cc_sc7280_clocks,
> >>         .num_clks = ARRAY_SIZE(gpu_cc_sc7280_clocks),
> >>         .gdscs = gpu_cc_sc7180_gdscs,
> >>         .num_gdscs = ARRAY_SIZE(gpu_cc_sc7180_gdscs),
> >> +       .resets = gpucc_sc7280_resets,
> >> +       .num_resets = ARRAY_SIZE(gpucc_sc7280_resets),
> >>  };
> >>
> >>  static const struct of_device_id gpu_cc_sc7280_match_table[] = {
> >> --
> >> 2.7.4
> >>
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-08 21:06         ` Bjorn Andersson
@ 2022-12-09 17:36           ` Ulf Hansson
  2022-12-12 15:39             ` Ulf Hansson
  0 siblings, 1 reply; 27+ messages in thread
From: Ulf Hansson @ 2022-12-09 17:36 UTC (permalink / raw)
  To: Akhil P Oommen, Bjorn Andersson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On Thu, 8 Dec 2022 at 22:06, Bjorn Andersson <andersson@kernel.org> wrote:
>
> On Thu, Dec 08, 2022 at 02:40:55PM +0100, Ulf Hansson wrote:
> > On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson <andersson@kernel.org> wrote:
> > >
> > > On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
> > > > On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
> > > > >
> > > > > On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> > > > > >
> > > > >
> > > > > @Ulf, Akhil has a power-domain for a piece of hardware which may be
> > > > > voted active by multiple different subsystems (co-processors/execution
> > > > > contexts) in the system.
> > > > >
> > > > > As such, during the powering down sequence we don't wait for the
> > > > > power-domain to turn off. But in the event of an error, the recovery
> > > > > mechanism relies on waiting for the hardware to settle in a powered off
> > > > > state.
> > > > >
> > > > > The proposal here is to use the reset framework to wait for this state
> > > > > to be reached, before continuing with the recovery mechanism in the
> > > > > client driver.
> > > >
> > > > I tried to review the series (see my other replies), but I am not sure
> > > > I fully understand the consumer part.
> > > >
> > > > More exactly, when and who is going to pull the reset and at what point?
> > > >
> > > > >
> > > > > Given our other discussions on quirky behavior, do you have any
> > > > > input/suggestions on this?
> > > > >
> > > > > > Some clients like adreno gpu driver would like to ensure that its gdsc
> > > > > > is collapsed at hardware during a gpu reset sequence. This is because it
> > > > > > has a votable gdsc which could be ON due to a vote from another subsystem
> > > > > > like tz, hyp etc or due to an internal hardware signal. To allow
> > > > > > this, gpucc driver can expose an interface to the client driver using
> > > > > > reset framework. Using this the client driver can trigger a polling within
> > > > > > the gdsc driver.
> > > > >
> > > > > @Akhil, this description is fairly generic. As we've reached the state
> > > > > where the hardware has settled and we return to the client, what
> > > > > prevents it from being powered up again?
> > > > >
> > > > > Or is it simply a question of it hitting the powered-off state, not
> > > > > necessarily staying there?
> > > >
> > > > Okay, so it's indeed the GPU driver that is going to assert/de-assert
> > > > the reset at some point. Right?
> > > >
> > > > That seems like a reasonable approach to me, even if it's a bit
> > > > unclear under what conditions that could happen.
> > > >
> > >
> > > Generally the disable-path of the power-domain does not check that the
> > > power-domain is actually turned off, because the status might indicate
> > > that the hardware is voting for the power-domain to be on.
> >
> > Is there a good reason why the HW needs to vote too, when the GPU
> > driver is already in control?
> >
> > Or perhaps that depends on the running use case?
> >
> > >
> > > As part of the recovery of the GPU after some fatal fault, the GPU
> > > driver does something which will cause the hardware votes for the
> > > power-domain to be let go, and then the driver does pm_runtime_put().
> >
> > Okay. That "something", sounds like a device specific setting for the
> > corresponding gdsc, right?
> >
> > So somehow the GPU driver needs to manage that setting, right?
> >
> > >
> > > But in this case the GPU driver wants to ensure that the power-domain is
> > > actually powered down, before it does pm_runtime_get() again. To ensure
> > > that the hardware lost its state...
> >
> > I see.
> >
> > >
> > > The proposal here is to use a reset to reach into the power-domain
> > > provider and wait for the hardware to be turned off, before the GPU
> > > driver attempts turning the power-domain on again.
> > >
> > >
> > > In other words, there is no reset. This is a hack to make a normally
> > > asynchronous pd.power_off() to be synchronous in this particular case.
> >
> > Alright, assuming I understood your clarifications above correctly
> > (thanks!), I think I have got a much better picture now.
> >
> > Rather than abusing the reset interface, I think we should manage this
> > through the genpd's power on/off notifiers (GENPD_NOTIFY_OFF). The GPU
> > driver should register its corresponding device for them
> > (dev_pm_genpd_add_notifier()).
> >
> > The trick however, is to make the behaviour of the power-domain for
> > the gdsc (the genpd->power_off() callback) conditional on whether the
> > HW is configured to vote or not. If the HW can vote, it should not
> > poll for the state - and vice versa when the HW can't vote.
> >
>
> Per Akhil's description I misunderstood who the other voters are; but
> either way it's not the same "HW configured" mechanism as the one we're
> already discussing.

Okay, so this is another thing then.

>
>
> But if we based on similar means could control if the power_off() ops
> should be blocking, waiting for the status indication to show that the
> hardware is indeed powered down, I think this would meet the needs.

Right.

>
> And GENPD_NOTIFY_OFF seems to provide the notification that it was
> successful (i.e. happened within the timeout etc).
>
> > Would this work?
> >
>
> If we can control the behavior of the genpd, I think it would.

Okay, it seems like we need a new dev_pm_genpd_* interface that
consumers can call to instruct the genpd provider, that its
->power_off() callback needs to temporarily switch to become
synchronous.

I guess this could be useful for other similar cases too, where the
corresponding PM domain isn't actually being powered off, but rather
just voted for to become powered off, thus relying on the HW to do the
aggregation.

In any case, I am still a bit skeptical of the reset approach, as is
being suggested in the $subject series. Even if it's rather nice and
clean (but somewhat abusing the interface), it looks like there will
be synchronization problems between the calls to the
pm_runtime_put_sync() and reset_control_reset() in the GPU driver. The
"reset" may actually already have happened when the call to
reset_control_reset() is done, so we may fail to detect the power
collapse, right!?

Let me cook a patch for the new genpd interface that I have in mind,
then we can see how that plays out together with the other parts. I
will post it on Monday!

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-09 17:36           ` Ulf Hansson
@ 2022-12-12 15:39             ` Ulf Hansson
  2022-12-12 17:43               ` Akhil P Oommen
  2022-12-27 18:24               ` Bjorn Andersson
  0 siblings, 2 replies; 27+ messages in thread
From: Ulf Hansson @ 2022-12-12 15:39 UTC (permalink / raw)
  To: Akhil P Oommen, Bjorn Andersson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On Fri, 9 Dec 2022 at 18:36, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Thu, 8 Dec 2022 at 22:06, Bjorn Andersson <andersson@kernel.org> wrote:
> >
> > On Thu, Dec 08, 2022 at 02:40:55PM +0100, Ulf Hansson wrote:
> > > On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson <andersson@kernel.org> wrote:
> > > >
> > > > On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
> > > > > On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
> > > > > >
> > > > > > On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> > > > > > >
> > > > > >
> > > > > > @Ulf, Akhil has a power-domain for a piece of hardware which may be
> > > > > > voted active by multiple different subsystems (co-processors/execution
> > > > > > contexts) in the system.
> > > > > >
> > > > > > As such, during the powering down sequence we don't wait for the
> > > > > > power-domain to turn off. But in the event of an error, the recovery
> > > > > > mechanism relies on waiting for the hardware to settle in a powered off
> > > > > > state.
> > > > > >
> > > > > > The proposal here is to use the reset framework to wait for this state
> > > > > > to be reached, before continuing with the recovery mechanism in the
> > > > > > client driver.
> > > > >
> > > > > I tried to review the series (see my other replies), but I am not sure
> > > > > I fully understand the consumer part.
> > > > >
> > > > > More exactly, when and who is going to pull the reset and at what point?
> > > > >
> > > > > >
> > > > > > Given our other discussions on quirky behavior, do you have any
> > > > > > input/suggestions on this?
> > > > > >
> > > > > > > Some clients like adreno gpu driver would like to ensure that its gdsc
> > > > > > > is collapsed at hardware during a gpu reset sequence. This is because it
> > > > > > > has a votable gdsc which could be ON due to a vote from another subsystem
> > > > > > > like tz, hyp etc or due to an internal hardware signal. To allow
> > > > > > > this, gpucc driver can expose an interface to the client driver using
> > > > > > > reset framework. Using this the client driver can trigger a polling within
> > > > > > > the gdsc driver.
> > > > > >
> > > > > > @Akhil, this description is fairly generic. As we've reached the state
> > > > > > where the hardware has settled and we return to the client, what
> > > > > > prevents it from being powered up again?
> > > > > >
> > > > > > Or is it simply a question of it hitting the powered-off state, not
> > > > > > necessarily staying there?
> > > > >
> > > > > Okay, so it's indeed the GPU driver that is going to assert/de-assert
> > > > > the reset at some point. Right?
> > > > >
> > > > > That seems like a reasonable approach to me, even if it's a bit
> > > > > unclear under what conditions that could happen.
> > > > >
> > > >
> > > > Generally the disable-path of the power-domain does not check that the
> > > > power-domain is actually turned off, because the status might indicate
> > > > that the hardware is voting for the power-domain to be on.
> > >
> > > Is there a good reason why the HW needs to vote too, when the GPU
> > > driver is already in control?
> > >
> > > Or perhaps that depends on the running use case?
> > >
> > > >
> > > > As part of the recovery of the GPU after some fatal fault, the GPU
> > > > driver does something which will cause the hardware votes for the
> > > > power-domain to be let go, and then the driver does pm_runtime_put().
> > >
> > > Okay. That "something", sounds like a device specific setting for the
> > > corresponding gdsc, right?
> > >
> > > So somehow the GPU driver needs to manage that setting, right?
> > >
> > > >
> > > > But in this case the GPU driver wants to ensure that the power-domain is
> > > > actually powered down, before it does pm_runtime_get() again. To ensure
> > > > that the hardware lost its state...
> > >
> > > I see.
> > >
> > > >
> > > > The proposal here is to use a reset to reach into the power-domain
> > > > provider and wait for the hardware to be turned off, before the GPU
> > > > driver attempts turning the power-domain on again.
> > > >
> > > >
> > > > In other words, there is no reset. This is a hack to make a normally
> > > > asynchronous pd.power_off() to be synchronous in this particular case.
> > >
> > > Alright, assuming I understood your clarifications above correctly
> > > (thanks!), I think I have got a much better picture now.
> > >
> > > Rather than abusing the reset interface, I think we should manage this
> > > through the genpd's power on/off notifiers (GENPD_NOTIFY_OFF). The GPU
> > > driver should register its corresponding device for them
> > > (dev_pm_genpd_add_notifier()).
> > >
> > > The trick however, is to make the behaviour of the power-domain for
> > > the gdsc (the genpd->power_off() callback) conditional on whether the
> > > HW is configured to vote or not. If the HW can vote, it should not
> > > poll for the state - and vice versa when the HW can't vote.
> > >
> >
> > Per Akhil's description I misunderstood who the other voters are; but
> > either way it's not the same "HW configured" mechanism as the one we're
> > already discussing.
>
> Okay, so this is another thing then.
>
> >
> >
> > But if we based on similar means could control if the power_off() ops
> > should be blocking, waiting for the status indication to show that the
> > hardware is indeed powered down, I think this would meet the needs.
>
> Right.
>
> >
> > And GENPD_NOTIFY_OFF seems to provide the notification that it was
> > successful (i.e. happened within the timeout etc).
> >
> > > Would this work?
> > >
> >
> > If we can control the behavior of the genpd, I think it would.
>
> Okay, it seems like we need a new dev_pm_genpd_* interface that
> consumers can call to instruct the genpd provider, that its
> ->power_off() callback needs to temporarily switch to become
> synchronous.
>
> I guess this could be useful for other similar cases too, where the
> corresponding PM domain isn't actually being powered off, but rather
> just voted for to become powered off, thus relying on the HW to do the
> aggregation.
>
> In any case, I am still a bit skeptical of the reset approach, as is
> being suggested in the $subject series. Even if it's rather nice and
> clean (but somewhat abusing the interface), it looks like there will
> be synchronization problems between the calls to the
> pm_runtime_put_sync() and reset_control_reset() in the GPU driver. The
> "reset" may actually already have happened when the call to
> reset_control_reset() is done, so we may fail to detect the power
> collapse, right!?
>
> Let me cook a patch for the new genpd interface that I have in mind,
> then we can see how that plays out together with the other parts. I
> will post it on Monday!

Below is the genpd patch that I had in mind.

As I stated above, the GPU driver would need to register for genpd's
power on/off notificers (GENPD_NOTIFY_OFF). Then it should call the
new dev_pm_genpd_synced_poweroff() and finally pm_runtime_put().
Moreover, when the GPU driver receives the GENPD_NOTIFY_OFF
notification, it should probably just kick a completion variable,
allowing the path that calls pm_runtime_put() to wait for the
notification to arrive.

On the genpd provider side, the ->power_off() callback should be
updated to check the new genpd->synced_poweroff variable, to indicate
whether it should poll for power collapse or not.

I think this should work, but if you still prefer to use the "reset"
approach, that's entirely up to you to decide.

Kind regards
Uffe

-----

From: Ulf Hansson <ulf.hansson@linaro.org>
Date: Mon, 12 Dec 2022 16:08:05 +0100
Subject: [PATCH] PM: domains: Allow a genpd consumer to require a synced power
 off

TODO: Write commit message

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/base/power/domain.c | 22 ++++++++++++++++++++++
 include/linux/pm_domain.h   |  1 +
 2 files changed, 23 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index b46aa490b4cd..3402b2ea7f61 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -494,6 +494,27 @@ void dev_pm_genpd_set_next_wakeup(struct device
*dev, ktime_t next)
 }
 EXPORT_SYMBOL_GPL(dev_pm_genpd_set_next_wakeup);

+/**
+ * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
+ *
+ * @dev: Device to handle
+ *
+ * TODO: Add description
+ */
+void dev_pm_genpd_synced_poweroff(struct device *dev)
+{
+       struct generic_pm_domain *genpd;
+
+       genpd = dev_to_genpd_safe(dev);
+       if (!genpd)
+               return;
+
+       genpd_lock(genpd);
+               genpd->synced_poweroff = true;
+       genpd_unlock(genpd);
+}
+EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
+
 static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 {
        unsigned int state_idx = genpd->state_idx;
@@ -588,6 +609,7 @@ static int _genpd_power_off(struct
generic_pm_domain *genpd, bool timed)
 out:
        raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
                                NULL);
+       genpd->synced_poweroff = false;
        return 0;
 busy:
        raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index ebc351698090..09c6c67a4896 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -134,6 +134,7 @@ struct generic_pm_domain {
        unsigned int prepared_count;    /* Suspend counter of prepared
devices */
        unsigned int performance_state; /* Aggregated max performance state */
        cpumask_var_t cpus;             /* A cpumask of the attached CPUs */
+       bool synced_poweroff;           /* A consumer needs a synced poweroff */
        int (*power_off)(struct generic_pm_domain *domain);
        int (*power_on)(struct generic_pm_domain *domain);
        struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-12 15:39             ` Ulf Hansson
@ 2022-12-12 17:43               ` Akhil P Oommen
  2022-12-27 18:24               ` Bjorn Andersson
  1 sibling, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-12 17:43 UTC (permalink / raw)
  To: Ulf Hansson, Bjorn Andersson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On 12/12/2022 9:09 PM, Ulf Hansson wrote:
> On Fri, 9 Dec 2022 at 18:36, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On Thu, 8 Dec 2022 at 22:06, Bjorn Andersson <andersson@kernel.org> wrote:
>>> On Thu, Dec 08, 2022 at 02:40:55PM +0100, Ulf Hansson wrote:
>>>> On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson <andersson@kernel.org> wrote:
>>>>> On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
>>>>>> On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
>>>>>>> On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
>>>>>>> @Ulf, Akhil has a power-domain for a piece of hardware which may be
>>>>>>> voted active by multiple different subsystems (co-processors/execution
>>>>>>> contexts) in the system.
>>>>>>>
>>>>>>> As such, during the powering down sequence we don't wait for the
>>>>>>> power-domain to turn off. But in the event of an error, the recovery
>>>>>>> mechanism relies on waiting for the hardware to settle in a powered off
>>>>>>> state.
>>>>>>>
>>>>>>> The proposal here is to use the reset framework to wait for this state
>>>>>>> to be reached, before continuing with the recovery mechanism in the
>>>>>>> client driver.
>>>>>> I tried to review the series (see my other replies), but I am not sure
>>>>>> I fully understand the consumer part.
>>>>>>
>>>>>> More exactly, when and who is going to pull the reset and at what point?
>>>>>>
>>>>>>> Given our other discussions on quirky behavior, do you have any
>>>>>>> input/suggestions on this?
>>>>>>>
>>>>>>>> Some clients like adreno gpu driver would like to ensure that its gdsc
>>>>>>>> is collapsed at hardware during a gpu reset sequence. This is because it
>>>>>>>> has a votable gdsc which could be ON due to a vote from another subsystem
>>>>>>>> like tz, hyp etc or due to an internal hardware signal. To allow
>>>>>>>> this, gpucc driver can expose an interface to the client driver using
>>>>>>>> reset framework. Using this the client driver can trigger a polling within
>>>>>>>> the gdsc driver.
>>>>>>> @Akhil, this description is fairly generic. As we've reached the state
>>>>>>> where the hardware has settled and we return to the client, what
>>>>>>> prevents it from being powered up again?
>>>>>>>
>>>>>>> Or is it simply a question of it hitting the powered-off state, not
>>>>>>> necessarily staying there?
>>>>>> Okay, so it's indeed the GPU driver that is going to assert/de-assert
>>>>>> the reset at some point. Right?
>>>>>>
>>>>>> That seems like a reasonable approach to me, even if it's a bit
>>>>>> unclear under what conditions that could happen.
>>>>>>
>>>>> Generally the disable-path of the power-domain does not check that the
>>>>> power-domain is actually turned off, because the status might indicate
>>>>> that the hardware is voting for the power-domain to be on.
>>>> Is there a good reason why the HW needs to vote too, when the GPU
>>>> driver is already in control?
>>>>
>>>> Or perhaps that depends on the running use case?
>>>>
>>>>> As part of the recovery of the GPU after some fatal fault, the GPU
>>>>> driver does something which will cause the hardware votes for the
>>>>> power-domain to be let go, and then the driver does pm_runtime_put().
>>>> Okay. That "something", sounds like a device specific setting for the
>>>> corresponding gdsc, right?
>>>>
>>>> So somehow the GPU driver needs to manage that setting, right?
>>>>
>>>>> But in this case the GPU driver wants to ensure that the power-domain is
>>>>> actually powered down, before it does pm_runtime_get() again. To ensure
>>>>> that the hardware lost its state...
>>>> I see.
>>>>
>>>>> The proposal here is to use a reset to reach into the power-domain
>>>>> provider and wait for the hardware to be turned off, before the GPU
>>>>> driver attempts turning the power-domain on again.
>>>>>
>>>>>
>>>>> In other words, there is no reset. This is a hack to make a normally
>>>>> asynchronous pd.power_off() to be synchronous in this particular case.
>>>> Alright, assuming I understood your clarifications above correctly
>>>> (thanks!), I think I have got a much better picture now.
>>>>
>>>> Rather than abusing the reset interface, I think we should manage this
>>>> through the genpd's power on/off notifiers (GENPD_NOTIFY_OFF). The GPU
>>>> driver should register its corresponding device for them
>>>> (dev_pm_genpd_add_notifier()).
>>>>
>>>> The trick however, is to make the behaviour of the power-domain for
>>>> the gdsc (the genpd->power_off() callback) conditional on whether the
>>>> HW is configured to vote or not. If the HW can vote, it should not
>>>> poll for the state - and vice versa when the HW can't vote.
>>>>
>>> Per Akhil's description I misunderstood who the other voters are; but
>>> either way it's not the same "HW configured" mechanism as the one we're
>>> already discussing.
>> Okay, so this is another thing then.
>>
>>>
>>> But if we based on similar means could control if the power_off() ops
>>> should be blocking, waiting for the status indication to show that the
>>> hardware is indeed powered down, I think this would meet the needs.
>> Right.
>>
>>> And GENPD_NOTIFY_OFF seems to provide the notification that it was
>>> successful (i.e. happened within the timeout etc).
>>>
>>>> Would this work?
>>>>
>>> If we can control the behavior of the genpd, I think it would.
>> Okay, it seems like we need a new dev_pm_genpd_* interface that
>> consumers can call to instruct the genpd provider, that its
>> ->power_off() callback needs to temporarily switch to become
>> synchronous.
>>
>> I guess this could be useful for other similar cases too, where the
>> corresponding PM domain isn't actually being powered off, but rather
>> just voted for to become powered off, thus relying on the HW to do the
>> aggregation.
>>
>> In any case, I am still a bit skeptical of the reset approach, as is
>> being suggested in the $subject series. Even if it's rather nice and
>> clean (but somewhat abusing the interface), it looks like there will
>> be synchronization problems between the calls to the
>> pm_runtime_put_sync() and reset_control_reset() in the GPU driver. The
>> "reset" may actually already have happened when the call to
>> reset_control_reset() is done, so we may fail to detect the power
>> collapse, right!?
>>
>> Let me cook a patch for the new genpd interface that I have in mind,
>> then we can see how that plays out together with the other parts. I
>> will post it on Monday!
> Below is the genpd patch that I had in mind.
>
> As I stated above, the GPU driver would need to register for genpd's
> power on/off notificers (GENPD_NOTIFY_OFF). Then it should call the
> new dev_pm_genpd_synced_poweroff() and finally pm_runtime_put().
> Moreover, when the GPU driver receives the GENPD_NOTIFY_OFF
> notification, it should probably just kick a completion variable,
> allowing the path that calls pm_runtime_put() to wait for the
> notification to arrive.
>
> On the genpd provider side, the ->power_off() callback should be
> updated to check the new genpd->synced_poweroff variable, to indicate
> whether it should poll for power collapse or not.
>
> I think this should work, but if you still prefer to use the "reset"
> approach, that's entirely up to you to decide.
>
> Kind regards
> Uffe
>
> -----
>
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Date: Mon, 12 Dec 2022 16:08:05 +0100
> Subject: [PATCH] PM: domains: Allow a genpd consumer to require a synced power
>  off
>
> TODO: Write commit message
>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  drivers/base/power/domain.c | 22 ++++++++++++++++++++++
>  include/linux/pm_domain.h   |  1 +
>  2 files changed, 23 insertions(+)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index b46aa490b4cd..3402b2ea7f61 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -494,6 +494,27 @@ void dev_pm_genpd_set_next_wakeup(struct device
> *dev, ktime_t next)
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_genpd_set_next_wakeup);
>
> +/**
> + * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
> + *
> + * @dev: Device to handle
> + *
> + * TODO: Add description
> + */
> +void dev_pm_genpd_synced_poweroff(struct device *dev)
> +{
> +       struct generic_pm_domain *genpd;
> +
> +       genpd = dev_to_genpd_safe(dev);
> +       if (!genpd)
> +               return;
> +
> +       genpd_lock(genpd);
> +               genpd->synced_poweroff = true;
> +       genpd_unlock(genpd);
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
> +
>  static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
>  {
>         unsigned int state_idx = genpd->state_idx;
> @@ -588,6 +609,7 @@ static int _genpd_power_off(struct
> generic_pm_domain *genpd, bool timed)
>  out:
>         raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
>                                 NULL);
> +       genpd->synced_poweroff = false;
>         return 0;
>  busy:
>         raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index ebc351698090..09c6c67a4896 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -134,6 +134,7 @@ struct generic_pm_domain {
>         unsigned int prepared_count;    /* Suspend counter of prepared
> devices */
>         unsigned int performance_state; /* Aggregated max performance state */
>         cpumask_var_t cpus;             /* A cpumask of the attached CPUs */
> +       bool synced_poweroff;           /* A consumer needs a synced poweroff */
>         int (*power_off)(struct generic_pm_domain *domain);
>         int (*power_on)(struct generic_pm_domain *domain);
>         struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
Thanks a lot, Ulf. I will try to prototype rest of the changes on top this.

-Akhil.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support
  2022-12-08 21:09       ` Bjorn Andersson
@ 2022-12-12 17:49         ` Akhil P Oommen
  0 siblings, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-12 17:49 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Ulf Hansson, freedreno, dri-devel, linux-arm-msm, Rob Clark,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Andy Gross, Konrad Dybcio,
	Michael Turquette, Stephen Boyd, linux-clk, linux-kernel

On 12/9/2022 2:39 AM, Bjorn Andersson wrote:
> On Thu, Dec 08, 2022 at 08:54:39PM +0530, Akhil P Oommen wrote:
>> On 12/7/2022 9:16 PM, Ulf Hansson wrote:
>>> On Wed, 5 Oct 2022 at 11:08, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
>>>> Allow a consumer driver to poll for cx gdsc collapse through Reset
>>>> framework.
>>> Would you mind extending this commit message, to allow us to better
>>> understand what part is really the consumer part.
>> Sure. I can do that.
>>> I was expecting the consumer part to be the GPU (adreno) driver, but I
>>> may have failed to understand correctly. It would be nice to see an
>>> example of a typical sequence, where the reset is being
>>> asserted/deasserted, from the consumer point of view. Would you mind
>>> explaining this a bit more?
>> https://elixir.bootlin.com/linux/v6.1-rc8/source/drivers/gpu/drm/msm/adreno/a6xx_gpu.c#L1309
>> You are correct. The consumer is adreno gpu driver. When there is a gpu fault, these sequences are followed:
>> 1. drop pm_runtime_put() for gpu device which will drops its vote on 'cx' genpd. line: 1306
>> 2. At this point, there could be vote from either smmu driver (smmu is under same power domain too) or from other subsystems (tz/hyp).
> Can you confirm that this is happening completely independent of what
> the kernel does?
Yes, it is independent.
>
>> 3. So we call into gdsc driver through reset interface to poll the gdsc register to ensure it collapsed at least once. Line: 1309
> I other words, if we engineered 1. such that it would wait in
> gdsc_disable() until the condition for 3. is reached, that should work
> for you? (Obviously depending on the ability for us to engineer this...)
Yes, it will work.

-Akhil.
>
> Regards,
> Bjorn
>
>> 4. Then we turn ON gpu. line:1314.
>>
>> -Akhil.
>>>> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
>>>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>> Kind regards
>>> Uffe
>>>
>>>> ---
>>>>
>>>> (no changes since v3)
>>>>
>>>> Changes in v3:
>>>> - Convert 'struct qcom_reset_ops cx_gdsc_reset' to 'static const' (Krzysztof)
>>>>
>>>> Changes in v2:
>>>> - Minor update to use the updated custom reset ops implementation
>>>>
>>>>  drivers/clk/qcom/gpucc-sc7280.c | 10 ++++++++++
>>>>  1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/drivers/clk/qcom/gpucc-sc7280.c b/drivers/clk/qcom/gpucc-sc7280.c
>>>> index 9a832f2..fece3f4 100644
>>>> --- a/drivers/clk/qcom/gpucc-sc7280.c
>>>> +++ b/drivers/clk/qcom/gpucc-sc7280.c
>>>> @@ -433,12 +433,22 @@ static const struct regmap_config gpu_cc_sc7280_regmap_config = {
>>>>         .fast_io = true,
>>>>  };
>>>>
>>>> +static const struct qcom_reset_ops cx_gdsc_reset = {
>>>> +       .reset = gdsc_wait_for_collapse,
>>>> +};
>>>> +
>>>> +static const struct qcom_reset_map gpucc_sc7280_resets[] = {
>>>> +       [GPU_CX_COLLAPSE] = { .ops = &cx_gdsc_reset, .priv = &cx_gdsc },
>>>> +};
>>>> +
>>>>  static const struct qcom_cc_desc gpu_cc_sc7280_desc = {
>>>>         .config = &gpu_cc_sc7280_regmap_config,
>>>>         .clks = gpu_cc_sc7280_clocks,
>>>>         .num_clks = ARRAY_SIZE(gpu_cc_sc7280_clocks),
>>>>         .gdscs = gpu_cc_sc7180_gdscs,
>>>>         .num_gdscs = ARRAY_SIZE(gpu_cc_sc7180_gdscs),
>>>> +       .resets = gpucc_sc7280_resets,
>>>> +       .num_resets = ARRAY_SIZE(gpucc_sc7280_resets),
>>>>  };
>>>>
>>>>  static const struct of_device_id gpu_cc_sc7280_match_table[] = {
>>>> --
>>>> 2.7.4
>>>>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-12 15:39             ` Ulf Hansson
  2022-12-12 17:43               ` Akhil P Oommen
@ 2022-12-27 18:24               ` Bjorn Andersson
  2022-12-28  9:23                 ` Akhil P Oommen
  1 sibling, 1 reply; 27+ messages in thread
From: Bjorn Andersson @ 2022-12-27 18:24 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Akhil P Oommen, freedreno, dri-devel, linux-arm-msm, Rob Clark,
	Stephen Boyd, Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On Mon, Dec 12, 2022 at 04:39:09PM +0100, Ulf Hansson wrote:
> On Fri, 9 Dec 2022 at 18:36, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > On Thu, 8 Dec 2022 at 22:06, Bjorn Andersson <andersson@kernel.org> wrote:
> > >
> > > On Thu, Dec 08, 2022 at 02:40:55PM +0100, Ulf Hansson wrote:
> > > > On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson <andersson@kernel.org> wrote:
> > > > >
> > > > > On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
> > > > > > On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
> > > > > > >
> > > > > > > On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
> > > > > > > >
> > > > > > >
> > > > > > > @Ulf, Akhil has a power-domain for a piece of hardware which may be
> > > > > > > voted active by multiple different subsystems (co-processors/execution
> > > > > > > contexts) in the system.
> > > > > > >
> > > > > > > As such, during the powering down sequence we don't wait for the
> > > > > > > power-domain to turn off. But in the event of an error, the recovery
> > > > > > > mechanism relies on waiting for the hardware to settle in a powered off
> > > > > > > state.
> > > > > > >
> > > > > > > The proposal here is to use the reset framework to wait for this state
> > > > > > > to be reached, before continuing with the recovery mechanism in the
> > > > > > > client driver.
> > > > > >
> > > > > > I tried to review the series (see my other replies), but I am not sure
> > > > > > I fully understand the consumer part.
> > > > > >
> > > > > > More exactly, when and who is going to pull the reset and at what point?
> > > > > >
> > > > > > >
> > > > > > > Given our other discussions on quirky behavior, do you have any
> > > > > > > input/suggestions on this?
> > > > > > >
> > > > > > > > Some clients like adreno gpu driver would like to ensure that its gdsc
> > > > > > > > is collapsed at hardware during a gpu reset sequence. This is because it
> > > > > > > > has a votable gdsc which could be ON due to a vote from another subsystem
> > > > > > > > like tz, hyp etc or due to an internal hardware signal. To allow
> > > > > > > > this, gpucc driver can expose an interface to the client driver using
> > > > > > > > reset framework. Using this the client driver can trigger a polling within
> > > > > > > > the gdsc driver.
> > > > > > >
> > > > > > > @Akhil, this description is fairly generic. As we've reached the state
> > > > > > > where the hardware has settled and we return to the client, what
> > > > > > > prevents it from being powered up again?
> > > > > > >
> > > > > > > Or is it simply a question of it hitting the powered-off state, not
> > > > > > > necessarily staying there?
> > > > > >
> > > > > > Okay, so it's indeed the GPU driver that is going to assert/de-assert
> > > > > > the reset at some point. Right?
> > > > > >
> > > > > > That seems like a reasonable approach to me, even if it's a bit
> > > > > > unclear under what conditions that could happen.
> > > > > >
> > > > >
> > > > > Generally the disable-path of the power-domain does not check that the
> > > > > power-domain is actually turned off, because the status might indicate
> > > > > that the hardware is voting for the power-domain to be on.
> > > >
> > > > Is there a good reason why the HW needs to vote too, when the GPU
> > > > driver is already in control?
> > > >
> > > > Or perhaps that depends on the running use case?
> > > >
> > > > >
> > > > > As part of the recovery of the GPU after some fatal fault, the GPU
> > > > > driver does something which will cause the hardware votes for the
> > > > > power-domain to be let go, and then the driver does pm_runtime_put().
> > > >
> > > > Okay. That "something", sounds like a device specific setting for the
> > > > corresponding gdsc, right?
> > > >
> > > > So somehow the GPU driver needs to manage that setting, right?
> > > >
> > > > >
> > > > > But in this case the GPU driver wants to ensure that the power-domain is
> > > > > actually powered down, before it does pm_runtime_get() again. To ensure
> > > > > that the hardware lost its state...
> > > >
> > > > I see.
> > > >
> > > > >
> > > > > The proposal here is to use a reset to reach into the power-domain
> > > > > provider and wait for the hardware to be turned off, before the GPU
> > > > > driver attempts turning the power-domain on again.
> > > > >
> > > > >
> > > > > In other words, there is no reset. This is a hack to make a normally
> > > > > asynchronous pd.power_off() to be synchronous in this particular case.
> > > >
> > > > Alright, assuming I understood your clarifications above correctly
> > > > (thanks!), I think I have got a much better picture now.
> > > >
> > > > Rather than abusing the reset interface, I think we should manage this
> > > > through the genpd's power on/off notifiers (GENPD_NOTIFY_OFF). The GPU
> > > > driver should register its corresponding device for them
> > > > (dev_pm_genpd_add_notifier()).
> > > >
> > > > The trick however, is to make the behaviour of the power-domain for
> > > > the gdsc (the genpd->power_off() callback) conditional on whether the
> > > > HW is configured to vote or not. If the HW can vote, it should not
> > > > poll for the state - and vice versa when the HW can't vote.
> > > >
> > >
> > > Per Akhil's description I misunderstood who the other voters are; but
> > > either way it's not the same "HW configured" mechanism as the one we're
> > > already discussing.
> >
> > Okay, so this is another thing then.
> >
> > >
> > >
> > > But if we based on similar means could control if the power_off() ops
> > > should be blocking, waiting for the status indication to show that the
> > > hardware is indeed powered down, I think this would meet the needs.
> >
> > Right.
> >
> > >
> > > And GENPD_NOTIFY_OFF seems to provide the notification that it was
> > > successful (i.e. happened within the timeout etc).
> > >
> > > > Would this work?
> > > >
> > >
> > > If we can control the behavior of the genpd, I think it would.
> >
> > Okay, it seems like we need a new dev_pm_genpd_* interface that
> > consumers can call to instruct the genpd provider, that its
> > ->power_off() callback needs to temporarily switch to become
> > synchronous.
> >
> > I guess this could be useful for other similar cases too, where the
> > corresponding PM domain isn't actually being powered off, but rather
> > just voted for to become powered off, thus relying on the HW to do the
> > aggregation.
> >
> > In any case, I am still a bit skeptical of the reset approach, as is
> > being suggested in the $subject series. Even if it's rather nice and
> > clean (but somewhat abusing the interface), it looks like there will
> > be synchronization problems between the calls to the
> > pm_runtime_put_sync() and reset_control_reset() in the GPU driver. The
> > "reset" may actually already have happened when the call to
> > reset_control_reset() is done, so we may fail to detect the power
> > collapse, right!?
> >
> > Let me cook a patch for the new genpd interface that I have in mind,
> > then we can see how that plays out together with the other parts. I
> > will post it on Monday!
> 
> Below is the genpd patch that I had in mind.
> 
> As I stated above, the GPU driver would need to register for genpd's
> power on/off notificers (GENPD_NOTIFY_OFF). Then it should call the
> new dev_pm_genpd_synced_poweroff() and finally pm_runtime_put().
> Moreover, when the GPU driver receives the GENPD_NOTIFY_OFF
> notification, it should probably just kick a completion variable,
> allowing the path that calls pm_runtime_put() to wait for the
> notification to arrive.
> 
> On the genpd provider side, the ->power_off() callback should be
> updated to check the new genpd->synced_poweroff variable, to indicate
> whether it should poll for power collapse or not.
> 
> I think this should work, but if you still prefer to use the "reset"
> approach, that's entirely up to you to decide.
> 

I find this to be conceptually much cleaner. Thanks for the proposal!

Regards,
Bjorn

> Kind regards
> Uffe
> 
> -----
> 
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Date: Mon, 12 Dec 2022 16:08:05 +0100
> Subject: [PATCH] PM: domains: Allow a genpd consumer to require a synced power
>  off
> 
> TODO: Write commit message
> 
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  drivers/base/power/domain.c | 22 ++++++++++++++++++++++
>  include/linux/pm_domain.h   |  1 +
>  2 files changed, 23 insertions(+)
> 
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index b46aa490b4cd..3402b2ea7f61 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -494,6 +494,27 @@ void dev_pm_genpd_set_next_wakeup(struct device
> *dev, ktime_t next)
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_genpd_set_next_wakeup);
> 
> +/**
> + * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
> + *
> + * @dev: Device to handle
> + *
> + * TODO: Add description
> + */
> +void dev_pm_genpd_synced_poweroff(struct device *dev)
> +{
> +       struct generic_pm_domain *genpd;
> +
> +       genpd = dev_to_genpd_safe(dev);
> +       if (!genpd)
> +               return;
> +
> +       genpd_lock(genpd);
> +               genpd->synced_poweroff = true;
> +       genpd_unlock(genpd);
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
> +
>  static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
>  {
>         unsigned int state_idx = genpd->state_idx;
> @@ -588,6 +609,7 @@ static int _genpd_power_off(struct
> generic_pm_domain *genpd, bool timed)
>  out:
>         raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
>                                 NULL);
> +       genpd->synced_poweroff = false;
>         return 0;
>  busy:
>         raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index ebc351698090..09c6c67a4896 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -134,6 +134,7 @@ struct generic_pm_domain {
>         unsigned int prepared_count;    /* Suspend counter of prepared
> devices */
>         unsigned int performance_state; /* Aggregated max performance state */
>         cpumask_var_t cpus;             /* A cpumask of the attached CPUs */
> +       bool synced_poweroff;           /* A consumer needs a synced poweroff */
>         int (*power_off)(struct generic_pm_domain *domain);
>         int (*power_on)(struct generic_pm_domain *domain);
>         struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
> -- 
> 2.34.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface
  2022-12-27 18:24               ` Bjorn Andersson
@ 2022-12-28  9:23                 ` Akhil P Oommen
  0 siblings, 0 replies; 27+ messages in thread
From: Akhil P Oommen @ 2022-12-28  9:23 UTC (permalink / raw)
  To: Bjorn Andersson, Ulf Hansson
  Cc: freedreno, dri-devel, linux-arm-msm, Rob Clark, Stephen Boyd,
	Dmitry Baryshkov, Philipp Zabel, Douglas Anderson,
	krzysztof.kozlowski, Abhinav Kumar, Andy Gross, Daniel Vetter,
	David Airlie, Konrad Dybcio, Krzysztof Kozlowski,
	Michael Turquette, Rob Herring, Sean Paul, Stephen Boyd,
	devicetree, linux-clk, linux-kernel

On 12/27/2022 11:54 PM, Bjorn Andersson wrote:
> On Mon, Dec 12, 2022 at 04:39:09PM +0100, Ulf Hansson wrote:
>> On Fri, 9 Dec 2022 at 18:36, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>> On Thu, 8 Dec 2022 at 22:06, Bjorn Andersson <andersson@kernel.org> wrote:
>>>> On Thu, Dec 08, 2022 at 02:40:55PM +0100, Ulf Hansson wrote:
>>>>> On Wed, 7 Dec 2022 at 17:55, Bjorn Andersson <andersson@kernel.org> wrote:
>>>>>> On Wed, Dec 07, 2022 at 05:00:51PM +0100, Ulf Hansson wrote:
>>>>>>> On Thu, 1 Dec 2022 at 23:57, Bjorn Andersson <andersson@kernel.org> wrote:
>>>>>>>> On Wed, Oct 05, 2022 at 02:36:58PM +0530, Akhil P Oommen wrote:
>>>>>>>> @Ulf, Akhil has a power-domain for a piece of hardware which may be
>>>>>>>> voted active by multiple different subsystems (co-processors/execution
>>>>>>>> contexts) in the system.
>>>>>>>>
>>>>>>>> As such, during the powering down sequence we don't wait for the
>>>>>>>> power-domain to turn off. But in the event of an error, the recovery
>>>>>>>> mechanism relies on waiting for the hardware to settle in a powered off
>>>>>>>> state.
>>>>>>>>
>>>>>>>> The proposal here is to use the reset framework to wait for this state
>>>>>>>> to be reached, before continuing with the recovery mechanism in the
>>>>>>>> client driver.
>>>>>>> I tried to review the series (see my other replies), but I am not sure
>>>>>>> I fully understand the consumer part.
>>>>>>>
>>>>>>> More exactly, when and who is going to pull the reset and at what point?
>>>>>>>
>>>>>>>> Given our other discussions on quirky behavior, do you have any
>>>>>>>> input/suggestions on this?
>>>>>>>>
>>>>>>>>> Some clients like adreno gpu driver would like to ensure that its gdsc
>>>>>>>>> is collapsed at hardware during a gpu reset sequence. This is because it
>>>>>>>>> has a votable gdsc which could be ON due to a vote from another subsystem
>>>>>>>>> like tz, hyp etc or due to an internal hardware signal. To allow
>>>>>>>>> this, gpucc driver can expose an interface to the client driver using
>>>>>>>>> reset framework. Using this the client driver can trigger a polling within
>>>>>>>>> the gdsc driver.
>>>>>>>> @Akhil, this description is fairly generic. As we've reached the state
>>>>>>>> where the hardware has settled and we return to the client, what
>>>>>>>> prevents it from being powered up again?
>>>>>>>>
>>>>>>>> Or is it simply a question of it hitting the powered-off state, not
>>>>>>>> necessarily staying there?
>>>>>>> Okay, so it's indeed the GPU driver that is going to assert/de-assert
>>>>>>> the reset at some point. Right?
>>>>>>>
>>>>>>> That seems like a reasonable approach to me, even if it's a bit
>>>>>>> unclear under what conditions that could happen.
>>>>>>>
>>>>>> Generally the disable-path of the power-domain does not check that the
>>>>>> power-domain is actually turned off, because the status might indicate
>>>>>> that the hardware is voting for the power-domain to be on.
>>>>> Is there a good reason why the HW needs to vote too, when the GPU
>>>>> driver is already in control?
>>>>>
>>>>> Or perhaps that depends on the running use case?
>>>>>
>>>>>> As part of the recovery of the GPU after some fatal fault, the GPU
>>>>>> driver does something which will cause the hardware votes for the
>>>>>> power-domain to be let go, and then the driver does pm_runtime_put().
>>>>> Okay. That "something", sounds like a device specific setting for the
>>>>> corresponding gdsc, right?
>>>>>
>>>>> So somehow the GPU driver needs to manage that setting, right?
>>>>>
>>>>>> But in this case the GPU driver wants to ensure that the power-domain is
>>>>>> actually powered down, before it does pm_runtime_get() again. To ensure
>>>>>> that the hardware lost its state...
>>>>> I see.
>>>>>
>>>>>> The proposal here is to use a reset to reach into the power-domain
>>>>>> provider and wait for the hardware to be turned off, before the GPU
>>>>>> driver attempts turning the power-domain on again.
>>>>>>
>>>>>>
>>>>>> In other words, there is no reset. This is a hack to make a normally
>>>>>> asynchronous pd.power_off() to be synchronous in this particular case.
>>>>> Alright, assuming I understood your clarifications above correctly
>>>>> (thanks!), I think I have got a much better picture now.
>>>>>
>>>>> Rather than abusing the reset interface, I think we should manage this
>>>>> through the genpd's power on/off notifiers (GENPD_NOTIFY_OFF). The GPU
>>>>> driver should register its corresponding device for them
>>>>> (dev_pm_genpd_add_notifier()).
>>>>>
>>>>> The trick however, is to make the behaviour of the power-domain for
>>>>> the gdsc (the genpd->power_off() callback) conditional on whether the
>>>>> HW is configured to vote or not. If the HW can vote, it should not
>>>>> poll for the state - and vice versa when the HW can't vote.
>>>>>
>>>> Per Akhil's description I misunderstood who the other voters are; but
>>>> either way it's not the same "HW configured" mechanism as the one we're
>>>> already discussing.
>>> Okay, so this is another thing then.
>>>
>>>>
>>>> But if we based on similar means could control if the power_off() ops
>>>> should be blocking, waiting for the status indication to show that the
>>>> hardware is indeed powered down, I think this would meet the needs.
>>> Right.
>>>
>>>> And GENPD_NOTIFY_OFF seems to provide the notification that it was
>>>> successful (i.e. happened within the timeout etc).
>>>>
>>>>> Would this work?
>>>>>
>>>> If we can control the behavior of the genpd, I think it would.
>>> Okay, it seems like we need a new dev_pm_genpd_* interface that
>>> consumers can call to instruct the genpd provider, that its
>>> ->power_off() callback needs to temporarily switch to become
>>> synchronous.
>>>
>>> I guess this could be useful for other similar cases too, where the
>>> corresponding PM domain isn't actually being powered off, but rather
>>> just voted for to become powered off, thus relying on the HW to do the
>>> aggregation.
>>>
>>> In any case, I am still a bit skeptical of the reset approach, as is
>>> being suggested in the $subject series. Even if it's rather nice and
>>> clean (but somewhat abusing the interface), it looks like there will
>>> be synchronization problems between the calls to the
>>> pm_runtime_put_sync() and reset_control_reset() in the GPU driver. The
>>> "reset" may actually already have happened when the call to
>>> reset_control_reset() is done, so we may fail to detect the power
>>> collapse, right!?
>>>
>>> Let me cook a patch for the new genpd interface that I have in mind,
>>> then we can see how that plays out together with the other parts. I
>>> will post it on Monday!
>> Below is the genpd patch that I had in mind.
>>
>> As I stated above, the GPU driver would need to register for genpd's
>> power on/off notificers (GENPD_NOTIFY_OFF). Then it should call the
>> new dev_pm_genpd_synced_poweroff() and finally pm_runtime_put().
>> Moreover, when the GPU driver receives the GENPD_NOTIFY_OFF
>> notification, it should probably just kick a completion variable,
>> allowing the path that calls pm_runtime_put() to wait for the
>> notification to arrive.
>>
>> On the genpd provider side, the ->power_off() callback should be
>> updated to check the new genpd->synced_poweroff variable, to indicate
>> whether it should poll for power collapse or not.
>>
>> I think this should work, but if you still prefer to use the "reset"
>> approach, that's entirely up to you to decide.
>>
> I find this to be conceptually much cleaner. Thanks for the proposal!
>
> Regards,
> Bjorn
https://patchwork.freedesktop.org/series/111966/
Bjorn, this is the new series based on this proposal.

-Akhil.
>> Kind regards
>> Uffe
>>
>> -----
>>
>> From: Ulf Hansson <ulf.hansson@linaro.org>
>> Date: Mon, 12 Dec 2022 16:08:05 +0100
>> Subject: [PATCH] PM: domains: Allow a genpd consumer to require a synced power
>>  off
>>
>> TODO: Write commit message
>>
>> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
>> ---
>>  drivers/base/power/domain.c | 22 ++++++++++++++++++++++
>>  include/linux/pm_domain.h   |  1 +
>>  2 files changed, 23 insertions(+)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index b46aa490b4cd..3402b2ea7f61 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -494,6 +494,27 @@ void dev_pm_genpd_set_next_wakeup(struct device
>> *dev, ktime_t next)
>>  }
>>  EXPORT_SYMBOL_GPL(dev_pm_genpd_set_next_wakeup);
>>
>> +/**
>> + * dev_pm_genpd_synced_poweroff - Next power off should be synchronous
>> + *
>> + * @dev: Device to handle
>> + *
>> + * TODO: Add description
>> + */
>> +void dev_pm_genpd_synced_poweroff(struct device *dev)
>> +{
>> +       struct generic_pm_domain *genpd;
>> +
>> +       genpd = dev_to_genpd_safe(dev);
>> +       if (!genpd)
>> +               return;
>> +
>> +       genpd_lock(genpd);
>> +               genpd->synced_poweroff = true;
>> +       genpd_unlock(genpd);
>> +}
>> +EXPORT_SYMBOL_GPL(dev_pm_genpd_synced_poweroff);
>> +
>>  static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
>>  {
>>         unsigned int state_idx = genpd->state_idx;
>> @@ -588,6 +609,7 @@ static int _genpd_power_off(struct
>> generic_pm_domain *genpd, bool timed)
>>  out:
>>         raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
>>                                 NULL);
>> +       genpd->synced_poweroff = false;
>>         return 0;
>>  busy:
>>         raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
>> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
>> index ebc351698090..09c6c67a4896 100644
>> --- a/include/linux/pm_domain.h
>> +++ b/include/linux/pm_domain.h
>> @@ -134,6 +134,7 @@ struct generic_pm_domain {
>>         unsigned int prepared_count;    /* Suspend counter of prepared
>> devices */
>>         unsigned int performance_state; /* Aggregated max performance state */
>>         cpumask_var_t cpus;             /* A cpumask of the attached CPUs */
>> +       bool synced_poweroff;           /* A consumer needs a synced poweroff */
>>         int (*power_off)(struct generic_pm_domain *domain);
>>         int (*power_on)(struct generic_pm_domain *domain);
>>         struct raw_notifier_head power_notifiers; /* Power on/off notifiers */
>> -- 
>> 2.34.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-12-28  9:24 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-05  9:06 [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
2022-10-05  9:06 ` [PATCH v7 1/6] dt-bindings: clk: qcom: Support gpu cx gdsc reset Akhil P Oommen
2022-10-05  9:07 ` [PATCH v7 2/6] clk: qcom: Allow custom reset ops Akhil P Oommen
2022-10-05  9:07 ` [PATCH v7 3/6] clk: qcom: gdsc: Add a reset op to poll gdsc collapse Akhil P Oommen
2022-12-07 15:45   ` Ulf Hansson
2022-12-08 15:02     ` Akhil P Oommen
2022-12-08 15:30       ` [Freedreno] " Akhil P Oommen
2022-10-05  9:07 ` [PATCH v7 4/6] clk: qcom: gpucc-sc7280: Add cx collapse reset support Akhil P Oommen
2022-12-07 15:46   ` Ulf Hansson
2022-12-08 15:24     ` Akhil P Oommen
2022-12-08 21:09       ` Bjorn Andersson
2022-12-12 17:49         ` Akhil P Oommen
2022-11-07 16:54 ` [PATCH v7 0/6] clk/qcom: Support gdsc collapse polling using 'reset' interface Akhil P Oommen
2022-12-01 22:57 ` Bjorn Andersson
2022-12-02  7:00   ` [Freedreno] " Akhil P Oommen
2022-12-06 19:58     ` Akhil P Oommen
2022-12-07 16:00   ` Ulf Hansson
2022-12-07 16:54     ` Bjorn Andersson
2022-12-08 13:40       ` Ulf Hansson
2022-12-08 14:45         ` Akhil P Oommen
2022-12-08 21:06         ` Bjorn Andersson
2022-12-09 17:36           ` Ulf Hansson
2022-12-12 15:39             ` Ulf Hansson
2022-12-12 17:43               ` Akhil P Oommen
2022-12-27 18:24               ` Bjorn Andersson
2022-12-28  9:23                 ` Akhil P Oommen
2022-12-08 15:31     ` Akhil P Oommen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).