linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/4] interconnect: Fix sync-state issues
@ 2021-07-21 17:54 Mike Tipton
  2021-07-21 17:54 ` [PATCH v2 1/4] interconnect: Zero initial BW after sync-state Mike Tipton
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Mike Tipton @ 2021-07-21 17:54 UTC (permalink / raw)
  To: djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm, Mike Tipton

These patches fix a couple of sync-state bugs that cause the initial BW
floors either to be ignored entirely or never to be removed after
sync-state is called.

v2:
- Move pre_aggregate call to outside the aggregate if statement

Mike Tipton (4):
  interconnect: Zero initial BW after sync-state
  interconnect: Always call pre_aggregate before aggregate
  interconnect: qcom: icc-rpmh: Ensure floor BW is enforced for all nodes
  interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate

 drivers/interconnect/core.c          |  7 +++++++
 drivers/interconnect/qcom/icc-rpmh.c | 20 ++++++++++----------
 2 files changed, 17 insertions(+), 10 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



* [PATCH v2 1/4] interconnect: Zero initial BW after sync-state
  2021-07-21 17:54 [PATCH v2 0/4] interconnect: Fix sync-state issues Mike Tipton
@ 2021-07-21 17:54 ` Mike Tipton
  2021-07-21 17:54 ` [PATCH v2 2/4] interconnect: Always call pre_aggregate before aggregate Mike Tipton
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Mike Tipton @ 2021-07-21 17:54 UTC (permalink / raw)
  To: djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm, Mike Tipton

The initial BW values may be used by providers to enforce floors. Zero
these values after sync-state so that providers know when to stop
enforcing them.

Fixes: b1d681d8d324 ("interconnect: Add sync state support")
Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
---
 drivers/interconnect/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index 8a1e70e00876..945121e18b5c 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -1106,6 +1106,8 @@ void icc_sync_state(struct device *dev)
 		dev_dbg(p->dev, "interconnect provider is in synced state\n");
 		list_for_each_entry(n, &p->nodes, node_list) {
 			if (n->init_avg || n->init_peak) {
+				n->init_avg = 0;
+				n->init_peak = 0;
 				aggregate_requests(n);
 				p->set(n, n);
 			}
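
As a rough illustration of why the two added assignments matter, the
following standalone sketch (not kernel code) models a provider that
treats the initial BW as a floor. The names `struct node`,
`effective_bw()` and `sync_state()` are hypothetical and only mirror
the fields touched by the hunk above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the floor-related fields of struct icc_node. */
struct node {
	uint32_t init_avg;	/* initial (boot-time) floor */
	uint32_t req_avg;	/* sum of the current client requests */
};

/* A floor-enforcing provider votes for max(floor, requests). */
static uint32_t effective_bw(const struct node *n)
{
	assert(n);
	return n->req_avg > n->init_avg ? n->req_avg : n->init_avg;
}

/* Mirrors the patched flow: drop the floor first, then re-evaluate. */
static uint32_t sync_state(struct node *n)
{
	assert(n);
	if (n->init_avg)
		n->init_avg = 0;
	return effective_bw(n);
}
```

Without the zeroing step, `effective_bw()` would keep returning the
floor after sync-state, which is the "never removed" case described in
the cover letter.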

* [PATCH v2 2/4] interconnect: Always call pre_aggregate before aggregate
  2021-07-21 17:54 [PATCH v2 0/4] interconnect: Fix sync-state issues Mike Tipton
  2021-07-21 17:54 ` [PATCH v2 1/4] interconnect: Zero initial BW after sync-state Mike Tipton
@ 2021-07-21 17:54 ` Mike Tipton
  2021-07-21 17:54 ` [PATCH v2 3/4] interconnect: qcom: icc-rpmh: Ensure floor BW is enforced for all nodes Mike Tipton
  2021-07-21 17:54 ` [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate Mike Tipton
  3 siblings, 0 replies; 12+ messages in thread
From: Mike Tipton @ 2021-07-21 17:54 UTC (permalink / raw)
  To: djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm, Mike Tipton

The pre_aggregate callback isn't called in all cases before calling
aggregate. Add the missing calls so providers can rely on consistent
framework behavior.

Fixes: d3703b3e255f ("interconnect: Aggregate before setting initial bandwidth")
Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
---
 drivers/interconnect/core.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index 945121e18b5c..1b2c564eaa99 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -973,9 +973,14 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
 	}
 	node->avg_bw = node->init_avg;
 	node->peak_bw = node->init_peak;
+
+	if (provider->pre_aggregate)
+		provider->pre_aggregate(node);
+
 	if (provider->aggregate)
 		provider->aggregate(node, 0, node->init_avg, node->init_peak,
 				    &node->avg_bw, &node->peak_bw);
+
 	provider->set(node, node);
 	node->avg_bw = 0;
 	node->peak_bw = 0;
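
One way to picture the contract this patch restores: a provider's
pre_aggregate() typically resets its accumulators, so calling
aggregate() without it can fold stale state into the result. The sketch
below is a simplified model under that assumption; `struct node` and
`aggregate_once()` are hypothetical names, not framework API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical provider-private accumulator. */
struct node {
	uint32_t sum_avg;
};

/* Reset step that the provider expects before every aggregation pass. */
static void pre_aggregate(struct node *n)
{
	n->sum_avg = 0;
}

/* Accumulate one request into the running sum. */
static void aggregate(struct node *n, uint32_t avg_bw)
{
	n->sum_avg += avg_bw;
}

/* call_pre = false models the old path, true the patched one. */
static uint32_t aggregate_once(struct node *n, uint32_t avg_bw, bool call_pre)
{
	assert(n);
	if (call_pre)
		pre_aggregate(n);
	aggregate(n, avg_bw);
	return n->sum_avg;
}
```

With `call_pre` false, whatever was left in `sum_avg` leaks into the
new total; with it true, the result depends only on the request passed
in.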

* [PATCH v2 3/4] interconnect: qcom: icc-rpmh: Ensure floor BW is enforced for all nodes
  2021-07-21 17:54 [PATCH v2 0/4] interconnect: Fix sync-state issues Mike Tipton
  2021-07-21 17:54 ` [PATCH v2 1/4] interconnect: Zero initial BW after sync-state Mike Tipton
  2021-07-21 17:54 ` [PATCH v2 2/4] interconnect: Always call pre_aggregate before aggregate Mike Tipton
@ 2021-07-21 17:54 ` Mike Tipton
  2021-07-21 17:54 ` [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate Mike Tipton
  3 siblings, 0 replies; 12+ messages in thread
From: Mike Tipton @ 2021-07-21 17:54 UTC (permalink / raw)
  To: djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm, Mike Tipton

We currently only enforce BW floors for a subset of nodes in a path.
All BCMs that need updating are queued in the pre_aggregate/aggregate
phase. The first set() commits all queued BCMs and subsequent set()
calls short-circuit without committing anything. Since the floor BW
isn't set in sum_avg/max_peak until set(), some BCMs are committed
before their associated nodes reflect the floor.

Set the floor as each node is being aggregated. This ensures that all
relevant floors are set before the BCMs are committed.

Fixes: 266cd33b5913 ("interconnect: qcom: Ensure that the floor bandwidth value is enforced")
Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
---
 drivers/interconnect/qcom/icc-rpmh.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/interconnect/qcom/icc-rpmh.c b/drivers/interconnect/qcom/icc-rpmh.c
index bf01d09dba6c..f118f57eae37 100644
--- a/drivers/interconnect/qcom/icc-rpmh.c
+++ b/drivers/interconnect/qcom/icc-rpmh.c
@@ -57,6 +57,11 @@ int qcom_icc_aggregate(struct icc_node *node, u32 tag, u32 avg_bw,
 			qn->sum_avg[i] += avg_bw;
 			qn->max_peak[i] = max_t(u32, qn->max_peak[i], peak_bw);
 		}
+
+		if (node->init_avg || node->init_peak) {
+			qn->sum_avg[i] = max_t(u64, qn->sum_avg[i], node->init_avg);
+			qn->max_peak[i] = max_t(u64, qn->max_peak[i], node->init_peak);
+		}
 	}
 
 	*agg_avg += avg_bw;
@@ -90,11 +95,6 @@ int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
 	qp = to_qcom_provider(node->provider);
 	qn = node->data;
 
-	qn->sum_avg[QCOM_ICC_BUCKET_AMC] = max_t(u64, qn->sum_avg[QCOM_ICC_BUCKET_AMC],
-						 node->avg_bw);
-	qn->max_peak[QCOM_ICC_BUCKET_AMC] = max_t(u64, qn->max_peak[QCOM_ICC_BUCKET_AMC],
-						  node->peak_bw);
-
 	qcom_icc_bcm_voter_commit(qp->voter);
 
 	return 0;
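
The ordering problem described in the commit message can be reproduced
with a small standalone model. This is only a sketch: it assumes a path
of two nodes, one BCM per node, and a commit step that flushes every
queued BCM at once. `set_path()`, `commit()` and the `queued` flag are
hypothetical simplifications of the voter's commit list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_NODES 2

struct node {
	uint32_t floor;		/* stands in for init_avg */
	uint32_t sum_avg;	/* aggregated vote */
	uint32_t hw_vote;	/* what the "hardware" last received */
	bool queued;		/* node's BCM is on the commit list */
};

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Commit every queued node; later calls find nothing left to do. */
static void commit(struct node *nodes)
{
	for (int i = 0; i < NUM_NODES; i++) {
		if (nodes[i].queued) {
			nodes[i].hw_vote = nodes[i].sum_avg;
			nodes[i].queued = false;
		}
	}
}

/* floor_in_aggregate = false models the old flow, true the patched flow. */
static void set_path(struct node *nodes, uint32_t req, bool floor_in_aggregate)
{
	assert(nodes);

	/* Aggregation phase: every node on the path is aggregated and queued. */
	for (int i = 0; i < NUM_NODES; i++) {
		nodes[i].sum_avg = req;
		if (floor_in_aggregate)
			nodes[i].sum_avg = max_u32(nodes[i].sum_avg, nodes[i].floor);
		nodes[i].queued = true;
	}

	/* set() phase: one call per node; the first call commits everything. */
	for (int i = 0; i < NUM_NODES; i++) {
		if (!floor_in_aggregate)
			nodes[i].sum_avg = max_u32(nodes[i].sum_avg, nodes[i].floor);
		commit(nodes);
	}
}
```

In the old flow the second node's floor is applied only after its BCM
has already been flushed, so the hardware never sees it. Applying the
floor during aggregation removes the dependency on set() ordering.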

* [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  2021-07-21 17:54 [PATCH v2 0/4] interconnect: Fix sync-state issues Mike Tipton
                   ` (2 preceding siblings ...)
  2021-07-21 17:54 ` [PATCH v2 3/4] interconnect: qcom: icc-rpmh: Ensure floor BW is enforced for all nodes Mike Tipton
@ 2021-07-21 17:54 ` Mike Tipton
  2021-08-10 23:31   ` Stephen Boyd
  3 siblings, 1 reply; 12+ messages in thread
From: Mike Tipton @ 2021-07-21 17:54 UTC (permalink / raw)
  To: djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm, Mike Tipton

We're only adding BCMs to the commit list in aggregate(), but there are
cases where pre_aggregate() is called without subsequently calling
aggregate(). In particular, in icc_sync_state() when a node with initial
BW has zero requests. Since BCMs aren't added to the commit list in
these cases, we don't actually send the zero BW request to HW. So the
resources remain on unnecessarily.

Add BCMs to the commit list in pre_aggregate() instead, which is always
called even when there are no requests.

Fixes: 976daac4a1c5 ("interconnect: qcom: Consolidate interconnect RPMh support")
Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
---
 drivers/interconnect/qcom/icc-rpmh.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/interconnect/qcom/icc-rpmh.c b/drivers/interconnect/qcom/icc-rpmh.c
index f118f57eae37..b26fda0588e0 100644
--- a/drivers/interconnect/qcom/icc-rpmh.c
+++ b/drivers/interconnect/qcom/icc-rpmh.c
@@ -20,13 +20,18 @@ void qcom_icc_pre_aggregate(struct icc_node *node)
 {
 	size_t i;
 	struct qcom_icc_node *qn;
+	struct qcom_icc_provider *qp;
 
 	qn = node->data;
+	qp = to_qcom_provider(node->provider);
 
 	for (i = 0; i < QCOM_ICC_NUM_BUCKETS; i++) {
 		qn->sum_avg[i] = 0;
 		qn->max_peak[i] = 0;
 	}
+
+	for (i = 0; i < qn->num_bcms; i++)
+		qcom_icc_bcm_voter_add(qp->voter, qn->bcms[i]);
 }
 EXPORT_SYMBOL_GPL(qcom_icc_pre_aggregate);
 
@@ -44,10 +49,8 @@ int qcom_icc_aggregate(struct icc_node *node, u32 tag, u32 avg_bw,
 {
 	size_t i;
 	struct qcom_icc_node *qn;
-	struct qcom_icc_provider *qp;
 
 	qn = node->data;
-	qp = to_qcom_provider(node->provider);
 
 	if (!tag)
 		tag = QCOM_ICC_TAG_ALWAYS;
@@ -67,9 +70,6 @@ int qcom_icc_aggregate(struct icc_node *node, u32 tag, u32 avg_bw,
 	*agg_avg += avg_bw;
 	*agg_peak = max_t(u32, *agg_peak, peak_bw);
 
-	for (i = 0; i < qn->num_bcms; i++)
-		qcom_icc_bcm_voter_add(qp->voter, qn->bcms[i]);
-
 	return 0;
 }
 EXPORT_SYMBOL_GPL(qcom_icc_aggregate);
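
The zero-request case from the commit message can be sketched as a
standalone model. It assumes, as the message describes, that the
framework calls pre_aggregate() once and aggregate() once per request,
so a node with no requests never reaches aggregate(). `resync()` and
the `queued` flag are hypothetical; the flag stands in for membership
in the voter's commit list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct node {
	uint32_t sum_avg;	/* aggregated vote */
	uint32_t hw_vote;	/* what the "hardware" last received */
	bool queued;		/* node's BCM is on the commit list */
};

static void pre_aggregate(struct node *n, bool queue_here)
{
	n->sum_avg = 0;
	if (queue_here)
		n->queued = true;
}

static void aggregate(struct node *n, uint32_t avg_bw, bool queue_here)
{
	n->sum_avg += avg_bw;
	if (queue_here)
		n->queued = true;
}

/* Only queued nodes are sent to hardware. */
static void commit(struct node *n)
{
	if (!n->queued)
		return;
	n->hw_vote = n->sum_avg;
	n->queued = false;
}

/* patched = true queues in pre_aggregate(), false queues in aggregate(). */
static void resync(struct node *n, const uint32_t *reqs, size_t nreqs,
		   bool patched)
{
	assert(n);
	pre_aggregate(n, patched);
	for (size_t i = 0; i < nreqs; i++)
		aggregate(n, reqs[i], !patched);
	commit(n);
}
```

With no requests and the old queueing point, `commit()` returns early
and the previous vote stays in place; with the patched queueing point
the zero vote is actually sent.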

* Re: [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  2021-07-21 17:54 ` [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate Mike Tipton
@ 2021-08-10 23:31   ` Stephen Boyd
  2021-08-11  0:18     ` Bjorn Andersson
  2021-08-11 16:01     ` Alex Elder
  0 siblings, 2 replies; 12+ messages in thread
From: Stephen Boyd @ 2021-08-10 23:31 UTC (permalink / raw)
  To: Mike Tipton, djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm, Alex Elder

Quoting Mike Tipton (2021-07-21 10:54:32)
> We're only adding BCMs to the commit list in aggregate(), but there are
> cases where pre_aggregate() is called without subsequently calling
> aggregate(). In particular, in icc_sync_state() when a node with initial
> BW has zero requests. Since BCMs aren't added to the commit list in
> these cases, we don't actually send the zero BW request to HW. So the
> resources remain on unnecessarily.
>
> Add BCMs to the commit list in pre_aggregate() instead, which is always
> called even when there are no requests.
>
> Fixes: 976daac4a1c5 ("interconnect: qcom: Consolidate interconnect RPMh support")
> Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
> ---

This patch breaks reboot for me on sc7180 Lazor

[  107.136454] kvm: exiting hardware virtualization
[  107.163741] platform video-firmware.0: Removing from iommu group 13
[  107.193412] SError Interrupt on CPU1, code 0xbe000011 -- SError
[  107.193428] CPU: 1 PID: 4289 Comm: reboot Not tainted 5.14.0-rc1+ #12
[  107.193432] Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
[  107.193436] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO BTYPE=--)
[  107.193440] pc : el1_interrupt+0x20/0x60
[  107.193443] lr : el1h_64_irq_handler+0x18/0x24
[  107.193445] sp : ffffffc014093a10
[  107.193448] x29: ffffffc014093a10 x28: ffffff8088295ec0 x27: 0000000000000000
[  107.193465] x26: ffffff8080ed4c18 x25: ffffffd0beece000 x24: ffffffd0bef45000
[  107.193476] x23: 0000000060400009 x22: ffffffd0be0bc1a0 x21: ffffffc014093b90
[  107.193487] x20: ffffffd0bdc100f8 x19: ffffffc014093a40 x18: 000000000007d829
[  107.193497] x17: ffffffd067412b54 x16: ffffffd0be0bc164 x15: ffffffd067413d0c
[  107.193507] x14: ffffffd0bdd24fa4 x13: ffffffd0bdc26180 x12: ffffffd0bdc26260
[  107.193517] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
[  107.193528] x8 : 00000000000000c0 x7 : bbbbbbbbbbbbbbbb x6 : ffffffd0bde488dc
[  107.193539] x5 : 0000000000200017 x4 : ffffff809b5c4b40 x3 : 0000000000200018
[  107.193549] x2 : ffffff8088295ec0 x1 : ffffffd0bdc100f8 x0 : ffffffc014093a40
[  107.193561] Kernel panic - not syncing: Asynchronous SError Interrupt
[  107.193564] CPU: 1 PID: 4289 Comm: reboot Not tainted 5.14.0-rc1+ #12
[  107.193567] Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
[  107.193570] Call trace:
[  107.193573]  dump_backtrace+0x0/0x1c8
[  107.193577]  show_stack+0x24/0x30
[  107.193579]  dump_stack_lvl+0x64/0x7c
[  107.193582]  dump_stack+0x18/0x38
[  107.193584]  panic+0x158/0x39c
[  107.193586]  nmi_panic+0x88/0xa0
[  107.193589]  arm64_serror_panic+0x80/0x8c
[  107.193593]  do_serror+0x0/0x80
[  107.193595]  do_serror+0x58/0x80
[  107.193597]  el1h_64_error_handler+0x30/0x48
[  107.193601]  el1h_64_error+0x78/0x7c
[  107.193603]  el1_interrupt+0x20/0x60
[  107.193606]  el1h_64_irq_handler+0x18/0x24
[  107.193609]  el1h_64_irq+0x78/0x7c
[  107.193612]  refcount_dec_and_mutex_lock+0x3c/0xb4
[  107.193616]  ipa_clock_put+0x34/0x74 [ipa]
[  107.193619]  ipa_deconfig+0x64/0x74 [ipa]
[  107.193622]  ipa_remove+0xbc/0x110 [ipa]
[  107.193625]  ipa_shutdown+0x24/0x50 [ipa]
[  107.193628]  platform_shutdown+0x30/0x3c
[  107.193631]  device_shutdown+0x150/0x208
[  107.193633]  kernel_restart_prepare+0x44/0x50
[  107.193637]  kernel_restart+0x24/0x70
[  107.193640]  __arm64_sys_reboot+0x188/0x230
[  107.193643]  invoke_syscall+0x4c/0x120
[  107.193646]  el0_svc_common+0x84/0xe0
[  107.193648]  do_el0_svc_compat+0x2c/0x38
[  107.193651]  el0_svc_compat+0x20/0x30
[  107.193654]  el0t_32_sync_handler+0xc0/0xf0
[  107.193657]  el0t_32_sync+0x19c/0x1a0

Presumably some sort of interconnect is getting turned off earlier than
before?


* Re: [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  2021-08-10 23:31   ` Stephen Boyd
@ 2021-08-11  0:18     ` Bjorn Andersson
  2021-08-11  4:22       ` Stephen Boyd
  2021-08-18  4:43       ` Mike Tipton
  2021-08-11 16:01     ` Alex Elder
  1 sibling, 2 replies; 12+ messages in thread
From: Bjorn Andersson @ 2021-08-11  0:18 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Mike Tipton, djakov, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm, Alex Elder

On Tue 10 Aug 18:31 CDT 2021, Stephen Boyd wrote:

> Quoting Mike Tipton (2021-07-21 10:54:32)
> > We're only adding BCMs to the commit list in aggregate(), but there are
> > cases where pre_aggregate() is called without subsequently calling
> > aggregate(). In particular, in icc_sync_state() when a node with initial
> > BW has zero requests. Since BCMs aren't added to the commit list in
> > these cases, we don't actually send the zero BW request to HW. So the
> > resources remain on unnecessarily.
> >
> > Add BCMs to the commit list in pre_aggregate() instead, which is always
> > called even when there are no requests.
> >
> > Fixes: 976daac4a1c5 ("interconnect: qcom: Consolidate interconnect RPMh support")
> > Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
> > ---
> 
> This patch breaks reboot for me on sc7180 Lazor
> 

FWIW, it prevents at least SM8150 from booting (need to check my other
boards as well), because it's no longer okay to have the interconnect
providers defined without having all client paths specified.

Regards,
Bjorn


* Re: [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  2021-08-11  0:18     ` Bjorn Andersson
@ 2021-08-11  4:22       ` Stephen Boyd
  2021-08-18  4:43       ` Mike Tipton
  1 sibling, 0 replies; 12+ messages in thread
From: Stephen Boyd @ 2021-08-11  4:22 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Mike Tipton, djakov, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm, Alex Elder

Quoting Bjorn Andersson (2021-08-10 17:18:02)
> On Tue 10 Aug 18:31 CDT 2021, Stephen Boyd wrote:
>
> > Quoting Mike Tipton (2021-07-21 10:54:32)
> > > We're only adding BCMs to the commit list in aggregate(), but there are
> > > cases where pre_aggregate() is called without subsequently calling
> > > aggregate(). In particular, in icc_sync_state() when a node with initial
> > > BW has zero requests. Since BCMs aren't added to the commit list in
> > > these cases, we don't actually send the zero BW request to HW. So the
> > > resources remain on unnecessarily.
> > >
> > > Add BCMs to the commit list in pre_aggregate() instead, which is always
> > > called even when there are no requests.
> > >
> > > Fixes: 976daac4a1c5 ("interconnect: qcom: Consolidate interconnect RPMh support")
> > > Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
> > > ---
> >
> > This patch breaks reboot for me on sc7180 Lazor
> >
>
> FWIW, it prevents at least SM8150 from booting (need to check my other
> boards as well), because it's no longer okay to have the interconnect
> providers defined without having all client paths specified.

So maybe the best course of action is to revert this patch from Linus'
tree? It's not as big a deal as "can't boot", but it certainly makes
reboot annoying on sc7180.


* Re: [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  2021-08-10 23:31   ` Stephen Boyd
  2021-08-11  0:18     ` Bjorn Andersson
@ 2021-08-11 16:01     ` Alex Elder
  2021-08-11 18:13       ` Stephen Boyd
  1 sibling, 1 reply; 12+ messages in thread
From: Alex Elder @ 2021-08-11 16:01 UTC (permalink / raw)
  To: Stephen Boyd, Mike Tipton, djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm

On 8/10/21 6:31 PM, Stephen Boyd wrote:
> Quoting Mike Tipton (2021-07-21 10:54:32)
>> We're only adding BCMs to the commit list in aggregate(), but there are
>> cases where pre_aggregate() is called without subsequently calling
>> aggregate(). In particular, in icc_sync_state() when a node with initial
>> BW has zero requests. Since BCMs aren't added to the commit list in
>> these cases, we don't actually send the zero BW request to HW. So the
>> resources remain on unnecessarily.
>>
>> Add BCMs to the commit list in pre_aggregate() instead, which is always
>> called even when there are no requests.
>>
>> Fixes: 976daac4a1c5 ("interconnect: qcom: Consolidate interconnect RPMh support")
>> Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
>> ---
> 
> This patch breaks reboot for me on sc7180 Lazor

If I am using the interface improperly or something in the
IPA driver, please let me know.  I actually plan to switch
to using the bulk interfaces soon (FYI).

Thanks.

					-Alex


* Re: [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  2021-08-11 16:01     ` Alex Elder
@ 2021-08-11 18:13       ` Stephen Boyd
  2021-08-18  4:43         ` Mike Tipton
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Boyd @ 2021-08-11 18:13 UTC (permalink / raw)
  To: Alex Elder, Mike Tipton, djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm

Quoting Alex Elder (2021-08-11 09:01:27)
> On 8/10/21 6:31 PM, Stephen Boyd wrote:
> > Quoting Mike Tipton (2021-07-21 10:54:32)
> >> We're only adding BCMs to the commit list in aggregate(), but there are
> >> cases where pre_aggregate() is called without subsequently calling
> >> aggregate(). In particular, in icc_sync_state() when a node with initial
> >> BW has zero requests. Since BCMs aren't added to the commit list in
> >> these cases, we don't actually send the zero BW request to HW. So the
> >> resources remain on unnecessarily.
> >>
> >> Add BCMs to the commit list in pre_aggregate() instead, which is always
> >> called even when there are no requests.
> >>
> >> Fixes: 976daac4a1c5 ("interconnect: qcom: Consolidate interconnect RPMh support")
> >> Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
> >> ---
> >
> > This patch breaks reboot for me on sc7180 Lazor
>
> If I am using the interface improperly or something in the
> IPA driver, please let me know.  I actually plan to switch
> to using the bulk interfaces soon (FYI).
>

I suspect I'm seeing a shutdown ordering issue, where we start dropping
interconnect requests in driver shutdown callbacks and then some bus
turns off and the CPU can't access a device. Maybe a fix for this problem
(if reverting isn't an option) would be to add a shutdown hook to
rpmh-icc that effectively "props up" the bandwidth requests during
shutdown so that we don't have to think about finding the place that the
interconnect is turned off. We're shutting down/restarting anyway, so
there isn't much point in trying to be power efficient for the last few
moments of runtime.


* Re: [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  2021-08-11  0:18     ` Bjorn Andersson
  2021-08-11  4:22       ` Stephen Boyd
@ 2021-08-18  4:43       ` Mike Tipton
  1 sibling, 0 replies; 12+ messages in thread
From: Mike Tipton @ 2021-08-18  4:43 UTC (permalink / raw)
  To: Bjorn Andersson, Stephen Boyd
  Cc: djakov, agross, saravanak, okukatla, linux-pm, linux-kernel,
	linux-arm-msm, Alex Elder

On 8/10/2021 5:18 PM, Bjorn Andersson wrote:
> On Tue 10 Aug 18:31 CDT 2021, Stephen Boyd wrote:
> 
>> Quoting Mike Tipton (2021-07-21 10:54:32)
>>> We're only adding BCMs to the commit list in aggregate(), but there are
>>> cases where pre_aggregate() is called without subsequently calling
>>> aggregate(). In particular, in icc_sync_state() when a node with initial
>>> BW has zero requests. Since BCMs aren't added to the commit list in
>>> these cases, we don't actually send the zero BW request to HW. So the
>>> resources remain on unnecessarily.
>>>
>>> Add BCMs to the commit list in pre_aggregate() instead, which is always
>>> called even when there are no requests.
>>>
>>> Fixes: 976daac4a1c5 ("interconnect: qcom: Consolidate interconnect RPMh support")
>>> Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
>>> ---
>>
>> This patch breaks reboot for me on sc7180 Lazor
>>
> 
> FWIW, it prevents at least SM8150 from booting (need to check my other
> boards as well), because it's no longer okay to have the interconnect
> providers defined without having all client paths specified.

My testing was limited to sdm845, which didn't show any boot issues. But 
it's not terribly surprising for this to cause problems on some targets. 
Previously every node was enabled by default and left on permanently if 
nobody explicitly voted for them. This would happen even if these nodes 
weren't enabled in bootloaders, since most of the qcom providers aren't 
defining a get_bw() callback and thus the framework defaults 
init_avg/init_peak to INT_MAX. So any drivers relying on this default-on 
behavior would break.

We can try to get dumps of the NOC error registers at the time of 
failure to pinpoint the problematic access. Or we could try to narrow it 
down by marking more BCMs as keepalive. If they're marked as keepalive 
then we won't let them turn off even with this patch.

> 
> Regards,
> Bjorn
> 


* Re: [PATCH v2 4/4] interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  2021-08-11 18:13       ` Stephen Boyd
@ 2021-08-18  4:43         ` Mike Tipton
  0 siblings, 0 replies; 12+ messages in thread
From: Mike Tipton @ 2021-08-18  4:43 UTC (permalink / raw)
  To: Stephen Boyd, Alex Elder, djakov
  Cc: bjorn.andersson, agross, saravanak, okukatla, linux-pm,
	linux-kernel, linux-arm-msm

On 8/11/2021 11:13 AM, Stephen Boyd wrote:
> Quoting Alex Elder (2021-08-11 09:01:27)
>> On 8/10/21 6:31 PM, Stephen Boyd wrote:
>>> Quoting Mike Tipton (2021-07-21 10:54:32)
>>>> We're only adding BCMs to the commit list in aggregate(), but there are
>>>> cases where pre_aggregate() is called without subsequently calling
>>>> aggregate(). In particular, in icc_sync_state() when a node with initial
>>>> BW has zero requests. Since BCMs aren't added to the commit list in
>>>> these cases, we don't actually send the zero BW request to HW. So the
>>>> resources remain on unnecessarily.
>>>>
>>>> Add BCMs to the commit list in pre_aggregate() instead, which is always
>>>> called even when there are no requests.
>>>>
>>>> Fixes: 976daac4a1c5 ("interconnect: qcom: Consolidate interconnect RPMh support")
>>>> Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
>>>> ---
>>>
>>> This patch breaks reboot for me on sc7180 Lazor
>>
>> If I am using the interface improperly or something in the
>> IPA driver, please let me know.  I actually plan to switch
>> to using the bulk interfaces soon (FYI).
>>
> 
> I suspect I'm seeing a shutdown ordering issue, where we start dropping
> interconnect requests in driver shutdown callbacks and then some bus
> turns off and the CPU can't access a device. Maybe a fix for this problem
> (if reverting isn't an option) would be to add a shutdown hook to
> rpmh-icc that effectively "props up" the bandwidth requests during
> shutdown so that we don't have to think about finding the place that the
> interconnect is turned off. We're shutting down/restarting anyway, so
> there isn't much point in trying to be power efficient for the last few
> moments of runtime.
> 

I wouldn't have expected this change to impact reboot, since it 
should only affect places where pre_aggregate() is called without 
subsequently calling aggregate(). I don't think there are currently any 
places where that can happen other than icc_sync_state().

I suppose what could be happening is we're now disabling certain paths 
in icc_sync_state() and their associated drivers just aren't used, and 
don't attempt any accesses, until they're torn down during reboot. That 
doesn't seem particularly likely, but nothing else immediately comes to 
mind.

We already mark paths critical for the CPU as "keepalive" such that 
they'll never turn off. This includes the CPU path to DDR and top-level 
CSRs. Basically just paths that can't actually be turned off while SW is 
running. That logic is unchanged in this patch. So we generally 
shouldn't need any shutdown-specific callbacks to place BW votes during 
this window. Client drivers should still ensure they're sequencing their 
shutdown logic such that any bus accesses happen before they remove 
their BW requests.
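
The keepalive behavior mentioned above can be pictured with a small
standalone sketch. It assumes that a keepalive BCM whose aggregated
vote is zero is bumped to a minimal non-zero vote before being sent to
hardware. `struct bcm` is a reduced, hypothetical type, and
`KEEPALIVE_VOTE` and `bcm_commit()` are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a BCM: only what the sketch needs. */
struct bcm {
	bool keepalive;
	uint32_t hw_vote;	/* what the "hardware" last received */
};

/* Assumed minimal vote that keeps a resource on; the value is illustrative. */
#define KEEPALIVE_VOTE 1u

/* Send the aggregated vote, never letting a keepalive BCM drop to zero. */
static void bcm_commit(struct bcm *b, uint32_t aggregated)
{
	assert(b);
	if (b->keepalive && aggregated == 0)
		aggregated = KEEPALIVE_VOTE;
	b->hw_vote = aggregated;
}
```

Under this model, marking more BCMs as keepalive is a way to narrow
down which resource is being turned off, as suggested earlier in the
thread.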
