linux-arm-msm.vger.kernel.org archive mirror
* [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync()
@ 2020-05-28 14:48 Douglas Anderson
  2020-05-28 22:44 ` Stephen Boyd
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Douglas Anderson @ 2020-05-28 14:48 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson
  Cc: Maulik Shah, Douglas Anderson, Stephen Boyd, linux-arm-msm, linux-kernel

write_tcs_reg_sync() may be called after timekeeping is suspended,
so it's not OK to use ktime.  The readl_poll_timeout_atomic() macro
implicitly uses ktime.  This was causing a warning at suspend time.

Change to just loop 1000000 times with a delay of 1 us between loops.
This may give a timeout of more than 1 second but never less and is
safe even if timekeeping is suspended.

NOTE: I don't have any actual evidence that we need to loop here.
It's possible that all we really need to do is just read the value
back to ensure that the pipes are cleaned and the looping/comparing is
totally not needed.  I never saw the loop being needed in my tests.
However, the loop shouldn't hurt.

Fixes: 91160150aba0 ("soc: qcom: rpmh-rsc: Timeout after 1 second in write_tcs_reg_sync()")
Reported-by: Maulik Shah <mkshah@codeaurora.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/soc/qcom/rpmh-rsc.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index 076fd27f3081..906778e2c1fa 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -175,13 +175,21 @@ static void write_tcs_reg(const struct rsc_drv *drv, int reg, int tcs_id,
 static void write_tcs_reg_sync(const struct rsc_drv *drv, int reg, int tcs_id,
 			       u32 data)
 {
-	u32 new_data;
+	int i;
 
 	writel(data, tcs_reg_addr(drv, reg, tcs_id));
-	if (readl_poll_timeout_atomic(tcs_reg_addr(drv, reg, tcs_id), new_data,
-				      new_data == data, 1, USEC_PER_SEC))
-		pr_err("%s: error writing %#x to %d:%#x\n", drv->name,
-		       data, tcs_id, reg);
+
+	/*
+	 * Wait until we read back the same value.  Use a counter rather than
+	 * ktime for timeout since this may be called after timekeeping stops.
+	 */
+	for (i = 0; i < USEC_PER_SEC; i++) {
+		if (readl(tcs_reg_addr(drv, reg, tcs_id)) == data)
+			return;
+		udelay(1);
+	}
+	pr_err("%s: error writing %#x to %d:%#x\n", drv->name,
+	       data, tcs_id, reg);
 }
 
 /**
-- 
2.27.0.rc0.183.gde8f92d652-goog
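
For readers who want to try the counter-bounded polling idea from the commit message outside the kernel, here is a minimal userspace sketch. It is not the driver code: poll_reg_counted(), the callback types, and the fake register are hypothetical names introduced only for illustration, standing in for readl() and udelay(). The point it shows is that the bound is an iteration count, so no clock is consulted, and the real timeout can be longer than max_polls microseconds but not shorter, provided the delay callback waits at least as long as requested.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for readl() and udelay(); not kernel APIs. */
typedef uint32_t (*read_reg_fn)(void *ctx);
typedef void (*delay_us_fn)(unsigned int usecs);

/*
 * Poll until read_reg() returns 'expected' or until 'max_polls' reads have
 * failed.  The bound is a plain iteration count, so no clock is consulted.
 * Each read takes some time on top of the 1 us delay, so the effective
 * timeout is at least max_polls microseconds.
 */
static bool poll_reg_counted(read_reg_fn read_reg, void *ctx,
			     uint32_t expected, unsigned long max_polls,
			     delay_us_fn delay_us)
{
	unsigned long i;

	for (i = 0; i < max_polls; i++) {
		if (read_reg(ctx) == expected)
			return true;
		delay_us(1);
	}
	return false;
}

/* Fake register for experiments: the written value shows up after N reads. */
struct fake_reg {
	uint32_t value;
	unsigned int reads_until_visible;
	unsigned int reads;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;

	r->reads++;
	if (r->reads_until_visible > 0) {
		r->reads_until_visible--;
		return 0;	/* value has not landed yet */
	}
	return r->value;
}

/* No-op delay so the sketch runs instantly; a real caller would wait. */
static void fake_delay(unsigned int usecs)
{
	(void)usecs;
}
```

Passing the read and delay functions as callbacks is only a convenience for testing the loop in isolation; the patch itself calls readl() and udelay() directly.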



* Re: [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync()
  2020-05-28 14:48 [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync() Douglas Anderson
@ 2020-05-28 22:44 ` Stephen Boyd
  2020-05-29 17:00   ` Doug Anderson
  2020-06-18 21:52 ` Doug Anderson
  2020-06-24  4:41 ` Maulik Shah
  2 siblings, 1 reply; 5+ messages in thread
From: Stephen Boyd @ 2020-05-28 22:44 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson, Douglas Anderson
  Cc: Maulik Shah, Douglas Anderson, linux-arm-msm, linux-kernel

Quoting Douglas Anderson (2020-05-28 07:48:34)
> The write_tcs_reg_sync() may be called after timekeeping is suspended
> so it's not OK to use ktime.  The readl_poll_timeout_atomic() macro
> implicitly uses ktime.  This was causing a warning at suspend time.
> 
> Change to just loop 1000000 times with a delay of 1 us between loops.
> This may give a timeout of more than 1 second but never less and is
> safe even if timekeeping is suspended.
> 
> NOTE: I don't have any actual evidence that we need to loop here.
> It's possible that all we really need to do is just read the value
> back to ensure that the pipes are cleaned and the looping/comparing is
> totally not needed.  I never saw the loop being needed in my tests.
> However, the loop shouldn't hurt.
> 
> Fixes: 91160150aba0 ("soc: qcom: rpmh-rsc: Timeout after 1 second in write_tcs_reg_sync()")
> Reported-by: Maulik Shah <mkshah@codeaurora.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---

Reviewed-by: Stephen Boyd <sboyd@kernel.org>

Although I don't think ktime_get() inside of readl_poll_timeout_atomic()
is correct. The timekeeping base won't be able to update when a loop is
spinning in an irq disabled region. We need the tick interrupt to come
in and update the base. Spinning for a second with irqs disabled is also
insane for realtime so there's that problem too. Maybe we should try to
kick timekeeping forward from these loops manually. Anyway, not problems
with this patch so not important to fix immediately.


* Re: [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync()
  2020-05-28 22:44 ` Stephen Boyd
@ 2020-05-29 17:00   ` Doug Anderson
  0 siblings, 0 replies; 5+ messages in thread
From: Doug Anderson @ 2020-05-29 17:00 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Andy Gross, Bjorn Andersson, Maulik Shah, linux-arm-msm, LKML

Hi,

On Thu, May 28, 2020 at 3:44 PM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Douglas Anderson (2020-05-28 07:48:34)
> > The write_tcs_reg_sync() may be called after timekeeping is suspended
> > so it's not OK to use ktime.  The readl_poll_timeout_atomic() macro
> > implicitly uses ktime.  This was causing a warning at suspend time.
> >
> > Change to just loop 1000000 times with a delay of 1 us between loops.
> > This may give a timeout of more than 1 second but never less and is
> > safe even if timekeeping is suspended.
> >
> > NOTE: I don't have any actual evidence that we need to loop here.
> > It's possible that all we really need to do is just read the value
> > back to ensure that the pipes are cleaned and the looping/comparing is
> > totally not needed.  I never saw the loop being needed in my tests.
> > However, the loop shouldn't hurt.
> >
> > Fixes: 91160150aba0 ("soc: qcom: rpmh-rsc: Timeout after 1 second in write_tcs_reg_sync()")
> > Reported-by: Maulik Shah <mkshah@codeaurora.org>
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
>
> Reviewed-by: Stephen Boyd <sboyd@kernel.org>

Thanks!


> Although I don't think ktime_get() inside of readl_poll_timeout_atomic()
> is correct. The timekeeping base won't be able to update when a loop is
> spinning in an irq disabled region. We need the tick interrupt to come
> in and update the base.

Is this really a problem?  I'm not totally familiar with the
timekeeping code, but I know I've used ktime to time things while
interrupts are disabled in the past.  It looks as if things are OK as
long as the base is updated every once in a while and it just does
deltas from there...
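
To make the "deltas from a base" idea concrete, here is a deliberately simplified userspace model. It is an assumption-laden sketch, not the kernel's timekeeping code: struct toy_timekeeper, toy_get_ns(), toy_update_base(), and the fixed ns_per_cycle factor are all hypothetical. It only illustrates that a reader can compute the current time from a free-running counter without the base being refreshed in between, as long as the counter delta stays small enough for the conversion to represent.

```c
#include <stdint.h>

/* Simplified, hypothetical model; not the kernel's timekeeping structures. */
struct toy_timekeeper {
	uint64_t base_ns;	/* time recorded at the last base update */
	uint64_t base_cycles;	/* free-running counter value at that update */
	uint64_t ns_per_cycle;	/* fixed conversion factor for this toy model */
};

/* Current time = base + (counter delta since the base) * conversion factor. */
static uint64_t toy_get_ns(const struct toy_timekeeper *tk, uint64_t now_cycles)
{
	return tk->base_ns + (now_cycles - tk->base_cycles) * tk->ns_per_cycle;
}

/* What a periodic tick would do in this model: fold the delta into the base. */
static void toy_update_base(struct toy_timekeeper *tk, uint64_t now_cycles)
{
	tk->base_ns = toy_get_ns(tk, now_cycles);
	tk->base_cycles = now_cycles;
}
```

In this model, two reads with no base update in between still return increasing times, because the counter keeps running; the base update only keeps the delta short.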


> Spinning for a second with irqs disabled is also
> insane for realtime so there's that problem too.

Yeah.  I just arbitrarily picked 1 second originally so we didn't loop
infinitely.  The expectation is that we'd never actually hit this
timeout.  If we do then there's (presumably) some type of serious
problem that needs to be debugged.


-Doug


* Re: [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync()
  2020-05-28 14:48 [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync() Douglas Anderson
  2020-05-28 22:44 ` Stephen Boyd
@ 2020-06-18 21:52 ` Doug Anderson
  2020-06-24  4:41 ` Maulik Shah
  2 siblings, 0 replies; 5+ messages in thread
From: Doug Anderson @ 2020-06-18 21:52 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson
  Cc: Maulik Shah, Stephen Boyd, linux-arm-msm, LKML

Bjorn and Andy,

On Thu, May 28, 2020 at 7:48 AM Douglas Anderson <dianders@chromium.org> wrote:
>
> The write_tcs_reg_sync() may be called after timekeeping is suspended
> so it's not OK to use ktime.  The readl_poll_timeout_atomic() macro
> implicitly uses ktime.  This was causing a warning at suspend time.
>
> Change to just loop 1000000 times with a delay of 1 us between loops.
> This may give a timeout of more than 1 second but never less and is
> safe even if timekeeping is suspended.
>
> NOTE: I don't have any actual evidence that we need to loop here.
> It's possible that all we really need to do is just read the value
> back to ensure that the pipes are cleaned and the looping/comparing is
> totally not needed.  I never saw the loop being needed in my tests.
> However, the loop shouldn't hurt.
>
> Fixes: 91160150aba0 ("soc: qcom: rpmh-rsc: Timeout after 1 second in write_tcs_reg_sync()")
> Reported-by: Maulik Shah <mkshah@codeaurora.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
>  drivers/soc/qcom/rpmh-rsc.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)

Is it a good time to land this change now that -rc1 has come out?
It'd be nice to get this resolved.

Thanks!

-Doug


* Re: [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync()
  2020-05-28 14:48 [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync() Douglas Anderson
  2020-05-28 22:44 ` Stephen Boyd
  2020-06-18 21:52 ` Doug Anderson
@ 2020-06-24  4:41 ` Maulik Shah
  2 siblings, 0 replies; 5+ messages in thread
From: Maulik Shah @ 2020-06-24  4:41 UTC (permalink / raw)
  To: Douglas Anderson, Andy Gross, Bjorn Andersson
  Cc: Stephen Boyd, linux-arm-msm, linux-kernel

Reviewed-by: Maulik Shah <mkshah@codeaurora.org>

Thanks,
Maulik

On 5/28/2020 8:18 PM, Douglas Anderson wrote:
> The write_tcs_reg_sync() may be called after timekeeping is suspended
> so it's not OK to use ktime.  The readl_poll_timeout_atomic() macro
> implicitly uses ktime.  This was causing a warning at suspend time.
>
> Change to just loop 1000000 times with a delay of 1 us between loops.
> This may give a timeout of more than 1 second but never less and is
> safe even if timekeeping is suspended.
>
> NOTE: I don't have any actual evidence that we need to loop here.
> It's possible that all we really need to do is just read the value
> back to ensure that the pipes are cleaned and the looping/comparing is
> totally not needed.  I never saw the loop being needed in my tests.
> However, the loop shouldn't hurt.
>
> Fixes: 91160150aba0 ("soc: qcom: rpmh-rsc: Timeout after 1 second in write_tcs_reg_sync()")
> Reported-by: Maulik Shah <mkshah@codeaurora.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
>   drivers/soc/qcom/rpmh-rsc.c | 18 +++++++++++++-----
>   1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
> index 076fd27f3081..906778e2c1fa 100644
> --- a/drivers/soc/qcom/rpmh-rsc.c
> +++ b/drivers/soc/qcom/rpmh-rsc.c
> @@ -175,13 +175,21 @@ static void write_tcs_reg(const struct rsc_drv *drv, int reg, int tcs_id,
>   static void write_tcs_reg_sync(const struct rsc_drv *drv, int reg, int tcs_id,
>   			       u32 data)
>   {
> -	u32 new_data;
> +	int i;
>   
>   	writel(data, tcs_reg_addr(drv, reg, tcs_id));
> -	if (readl_poll_timeout_atomic(tcs_reg_addr(drv, reg, tcs_id), new_data,
> -				      new_data == data, 1, USEC_PER_SEC))
> -		pr_err("%s: error writing %#x to %d:%#x\n", drv->name,
> -		       data, tcs_id, reg);
> +
> +	/*
> +	 * Wait until we read back the same value.  Use a counter rather than
> +	 * ktime for timeout since this may be called after timekeeping stops.
> +	 */
> +	for (i = 0; i < USEC_PER_SEC; i++) {
> +		if (readl(tcs_reg_addr(drv, reg, tcs_id)) == data)
> +			return;
> +		udelay(1);
> +	}
> +	pr_err("%s: error writing %#x to %d:%#x\n", drv->name,
> +	       data, tcs_id, reg);
>   }
>   
>   /**

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation


