All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes
@ 2023-08-31  1:14 Stephen Boyd
  2023-08-31  1:14 ` [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop() Stephen Boyd
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Stephen Boyd @ 2023-08-31  1:14 UTC (permalink / raw)
  To: Mika Westerberg, Hans de Goede, Mark Gross
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Prashant Malani, Kuppuswamy Sathyanarayanan

I recently looked at some crash reports on ChromeOS devices that call
into this intel_scu_ipc driver. They were hitting timeouts, and it
certainly looks possible for those timeouts to be triggering because of
scheduling issues. Once things started going south, the timeouts kept
coming. Maybe that's because the other side got seriously confused? I
don't know. I'll poke at it some more by injecting timeouts on the
kernel side.

The first two patches are only lightly tested (normal functions keep
working), while the third one is purely speculation. I was going to make
the interrupt delay for a long time to see if I could hit the timeout.

Stephen Boyd (3):
  platform/x86: intel_scu_ipc: Check status after timeouts in
    busy_loop()
  platform/x86: intel_scu_ipc: Check status upon timeout in
    ipc_wait_for_interrupt()
  platform/x86: intel_scu_ipc: Fail IPC send if still busy

 drivers/platform/x86/intel_scu_ipc.c | 59 ++++++++++++++++++++--------
 1 file changed, 42 insertions(+), 17 deletions(-)


base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
-- 
https://chromeos.dev


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-08-31  1:14 [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Stephen Boyd
@ 2023-08-31  1:14 ` Stephen Boyd
  2023-08-31 13:53   ` Andy Shevchenko
                     ` (2 more replies)
  2023-08-31  1:14 ` [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt() Stephen Boyd
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 22+ messages in thread
From: Stephen Boyd @ 2023-08-31  1:14 UTC (permalink / raw)
  To: Mika Westerberg, Hans de Goede, Mark Gross
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Kuppuswamy Sathyanarayanan, Prashant Malani

It's possible for the polling loop in busy_loop() to get scheduled away
for a long time.

  status = ipc_read_status(scu);
  <long time scheduled away>
  if (!(status & IPC_STATUS_BUSY))

If this happens, then the status bit could change and this function
would never test it again after checking the jiffies against the timeout
limit. Polling code should check the condition one more time after the
timeout in case this happens.

The read_poll_timeout() helper implements this logic, and is shorter, so
simply use that helper here.

Cc: Prashant Malani <pmalani@chromium.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Fixes: e7b7ab3847c9 ("platform/x86: intel_scu_ipc: Sleeping is fine when polling")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
---
 drivers/platform/x86/intel_scu_ipc.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
index 6851d10d6582..5a37becc65aa 100644
--- a/drivers/platform/x86/intel_scu_ipc.c
+++ b/drivers/platform/x86/intel_scu_ipc.c
@@ -19,6 +19,7 @@
 #include <linux/init.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
+#include <linux/iopoll.h>
 #include <linux/module.h>
 #include <linux/slab.h>
 
@@ -231,19 +232,15 @@ static inline u32 ipc_data_readl(struct intel_scu_ipc_dev *scu, u32 offset)
 /* Wait till scu status is busy */
 static inline int busy_loop(struct intel_scu_ipc_dev *scu)
 {
-	unsigned long end = jiffies + IPC_TIMEOUT;
-
-	do {
-		u32 status;
-
-		status = ipc_read_status(scu);
-		if (!(status & IPC_STATUS_BUSY))
-			return (status & IPC_STATUS_ERR) ? -EIO : 0;
+	u8 status;
+	int err;
 
-		usleep_range(50, 100);
-	} while (time_before(jiffies, end));
+	err = read_poll_timeout(ipc_read_status, status, !(status & IPC_STATUS_BUSY),
+				100, jiffies_to_usecs(IPC_TIMEOUT), false, scu);
+	if (err)
+		return err;
 
-	return -ETIMEDOUT;
+	return (status & IPC_STATUS_ERR) ? -EIO : 0;
 }
 
 /* Wait till ipc ioc interrupt is received or timeout in 10 HZ */
-- 
https://chromeos.dev


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
  2023-08-31  1:14 [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Stephen Boyd
  2023-08-31  1:14 ` [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop() Stephen Boyd
@ 2023-08-31  1:14 ` Stephen Boyd
  2023-08-31 13:58   ` Andy Shevchenko
                     ` (2 more replies)
  2023-08-31  1:14 ` [PATCH 3/3] platform/x86: intel_scu_ipc: Fail IPC send if still busy Stephen Boyd
  2023-08-31  3:28 ` [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Kuppuswamy Sathyanarayanan
  3 siblings, 3 replies; 22+ messages in thread
From: Stephen Boyd @ 2023-08-31  1:14 UTC (permalink / raw)
  To: Mika Westerberg, Hans de Goede, Mark Gross
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Kuppuswamy Sathyanarayanan, Prashant Malani

It's possible for the completion in ipc_wait_for_interrupt() to timeout,
simply because the interrupt was delayed in being processed. A timeout
in itself is not an error. This driver should check the status register
upon a timeout to ensure that scheduling or interrupt processing delays
don't affect the outcome of the IPC return value.

 CPU0                                                   SCU
 ----                                                   ---
 ipc_wait_for_interrupt()
  wait_for_completion_timeout(&scu->cmd_complete)
  [TIMEOUT]                                             status[IPC_BUSY]=0

Fix this problem by reading the status bit in all cases, regardless of
the timeout. If the completion times out, we'll assume the problem was
that the IPC_BUSY bit was still set, but if the status bit is cleared in
the meantime we know that we hit some scheduling delay and we should
just check the error bit.

Cc: Prashant Malani <pmalani@chromium.org>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Fixes: ed12f295bfd5 ("ipc: Added support for IPC interrupt mode")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
---
 drivers/platform/x86/intel_scu_ipc.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
index 5a37becc65aa..2a21153e3bf3 100644
--- a/drivers/platform/x86/intel_scu_ipc.c
+++ b/drivers/platform/x86/intel_scu_ipc.c
@@ -246,16 +246,19 @@ static inline int busy_loop(struct intel_scu_ipc_dev *scu)
 /* Wait till ipc ioc interrupt is received or timeout in 10 HZ */
 static inline int ipc_wait_for_interrupt(struct intel_scu_ipc_dev *scu)
 {
-	int status;
+	unsigned long time_left;
+	u8 status;
+	int err = 0;
 
-	if (!wait_for_completion_timeout(&scu->cmd_complete, IPC_TIMEOUT))
-		return -ETIMEDOUT;
+	time_left = wait_for_completion_timeout(&scu->cmd_complete, IPC_TIMEOUT);
+	if (!time_left)
+		err = -ETIMEDOUT;
 
 	status = ipc_read_status(scu);
-	if (status & IPC_STATUS_ERR)
-		return -EIO;
+	if (!(status & IPC_STATUS_BUSY))
+		err = (status & IPC_STATUS_ERR) ? -EIO : 0;
 
-	return 0;
+	return err;
 }
 
 static int intel_scu_ipc_check_status(struct intel_scu_ipc_dev *scu)
-- 
https://chromeos.dev


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/3] platform/x86: intel_scu_ipc: Fail IPC send if still busy
  2023-08-31  1:14 [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Stephen Boyd
  2023-08-31  1:14 ` [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop() Stephen Boyd
  2023-08-31  1:14 ` [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt() Stephen Boyd
@ 2023-08-31  1:14 ` Stephen Boyd
  2023-08-31 14:07   ` Andy Shevchenko
  2023-08-31  3:28 ` [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Kuppuswamy Sathyanarayanan
  3 siblings, 1 reply; 22+ messages in thread
From: Stephen Boyd @ 2023-08-31  1:14 UTC (permalink / raw)
  To: Mika Westerberg, Hans de Goede, Mark Gross
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Kuppuswamy Sathyanarayanan, Prashant Malani

It's possible for interrupts to get significantly delayed to the point
that callers of intel_scu_ipc_dev_command() and friends can call the
function once, hit a timeout, and call it again while the interrupt
still hasn't been processed. This driver will get seriously confused if
the interrupt is finally processed after the second IPC has been sent
with ipc_command(). It won't know which IPC has been completed. This
could be quite disastrous if calling code assumes something has happened
upon return from intel_scu_ipc_dev_simple_command() when it actually
hasn't.

Let's avoid this scenario by simply returning -EBUSY in this case.
Hopefully higher layers will know to back off or fail gracefully when
this happens. It's all highly unlikely anyway, but it's better to be
correct here as we have no way to know which IPC the status register is
telling us about if we send a second IPC while the previous IPC is still
processing.

Cc: Prashant Malani <pmalani@chromium.org>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Fixes: ed12f295bfd5 ("ipc: Added support for IPC interrupt mode")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
---
 drivers/platform/x86/intel_scu_ipc.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
index 2a21153e3bf3..5a56be319f0e 100644
--- a/drivers/platform/x86/intel_scu_ipc.c
+++ b/drivers/platform/x86/intel_scu_ipc.c
@@ -266,6 +266,19 @@ static int intel_scu_ipc_check_status(struct intel_scu_ipc_dev *scu)
 	return scu->irq > 0 ? ipc_wait_for_interrupt(scu) : busy_loop(scu);
 }
 
+static bool intel_scu_ipc_busy(struct intel_scu_ipc_dev *scu)
+{
+	u8 status;
+
+	status = ipc_read_status(scu);
+	if (status & IPC_STATUS_BUSY) {
+		dev_err(&scu->dev, "device is busy\n");
+		return true;
+	}
+
+	return false;
+}
+
 /* Read/Write power control(PMIC in Langwell, MSIC in PenWell) registers */
 static int pwr_reg_rdwr(struct intel_scu_ipc_dev *scu, u16 *addr, u8 *data,
 			u32 count, u32 op, u32 id)
@@ -285,6 +298,10 @@ static int pwr_reg_rdwr(struct intel_scu_ipc_dev *scu, u16 *addr, u8 *data,
 		mutex_unlock(&ipclock);
 		return -ENODEV;
 	}
+	if (intel_scu_ipc_busy(scu)) {
+		mutex_unlock(&ipclock);
+		return -EBUSY;
+	}
 
 	for (nc = 0; nc < count; nc++, offset += 2) {
 		cbuf[offset] = addr[nc];
@@ -445,6 +462,10 @@ int intel_scu_ipc_dev_simple_command(struct intel_scu_ipc_dev *scu, int cmd,
 		return -ENODEV;
 	}
 	scu = ipcdev;
+	if (intel_scu_ipc_busy(scu)) {
+		mutex_unlock(&ipclock);
+		return -EBUSY;
+	}
 	cmdval = sub << 12 | cmd;
 	ipc_command(scu, cmdval);
 	err = intel_scu_ipc_check_status(scu);
@@ -490,6 +511,10 @@ int intel_scu_ipc_dev_command_with_size(struct intel_scu_ipc_dev *scu, int cmd,
 		mutex_unlock(&ipclock);
 		return -ENODEV;
 	}
+	if (intel_scu_ipc_busy(scu)) {
+		mutex_unlock(&ipclock);
+		return -EBUSY;
+	}
 
 	memcpy(inbuf, in, inlen);
 	for (i = 0; i < inbuflen; i++)
-- 
https://chromeos.dev


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes
  2023-08-31  1:14 [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Stephen Boyd
                   ` (2 preceding siblings ...)
  2023-08-31  1:14 ` [PATCH 3/3] platform/x86: intel_scu_ipc: Fail IPC send if still busy Stephen Boyd
@ 2023-08-31  3:28 ` Kuppuswamy Sathyanarayanan
  2023-09-05 22:55   ` Stephen Boyd
  3 siblings, 1 reply; 22+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-08-31  3:28 UTC (permalink / raw)
  To: Stephen Boyd, Mika Westerberg, Hans de Goede, Mark Gross
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Prashant Malani



On 8/30/2023 6:14 PM, Stephen Boyd wrote:
> I recently looked at some crash reports on ChromeOS devices that call
> into this intel_scu_ipc driver. They were hitting timeouts, and it
> certainly looks possible for those timeouts to be triggering because of
> scheduling issues. Once things started going south, the timeouts kept

Are you talking about timeouts during IPC command?

> coming. Maybe that's because the other side got seriously confused? I
> don't know. I'll poke at it some more by injecting timeouts on the
> kernel side.

Do you think it is possible due to a firmware issue?

> 
> The first two patches are only lightly tested (normal functions keep
> working), while the third one is purely speculation. I was going to make
> the interrupt delay for a long time to see if I could hit the timeout.
> 
> Stephen Boyd (3):
>   platform/x86: intel_scu_ipc: Check status after timeouts in
>     busy_loop()
>   platform/x86: intel_scu_ipc: Check status upon timeout in
>     ipc_wait_for_interrupt()
>   platform/x86: intel_scu_ipc: Fail IPC send if still busy
> 
>  drivers/platform/x86/intel_scu_ipc.c | 59 ++++++++++++++++++++--------
>  1 file changed, 42 insertions(+), 17 deletions(-)
> 
> 
> base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-08-31  1:14 ` [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop() Stephen Boyd
@ 2023-08-31 13:53   ` Andy Shevchenko
  2023-09-05 22:24     ` Stephen Boyd
  2023-08-31 14:15   ` Kuppuswamy Sathyanarayanan
  2023-09-01  5:50   ` Mika Westerberg
  2 siblings, 1 reply; 22+ messages in thread
From: Andy Shevchenko @ 2023-08-31 13:53 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Mika Westerberg, Hans de Goede, Mark Gross, linux-kernel,
	patches, platform-driver-x86, Kuppuswamy Sathyanarayanan,
	Prashant Malani

On Wed, Aug 30, 2023 at 06:14:01PM -0700, Stephen Boyd wrote:
> It's possible for the polling loop in busy_loop() to get scheduled away
> for a long time.
> 
>   status = ipc_read_status(scu);
>   <long time scheduled away>
>   if (!(status & IPC_STATUS_BUSY))
> 
> If this happens, then the status bit could change and this function
> would never test it again after checking the jiffies against the timeout
> limit. Polling code should check the condition one more time after the
> timeout in case this happens.
> 
> The read_poll_timeout() helper implements this logic, and is shorter, so
> simply use that helper here.

I don't remember by heart, but on some older Intel hardware this might have
been called during early stages where ktime() is not functional yet.

Is this still a case here?

...

Codewise change looks good to me.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
  2023-08-31  1:14 ` [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt() Stephen Boyd
@ 2023-08-31 13:58   ` Andy Shevchenko
  2023-09-05 22:36     ` Stephen Boyd
  2023-08-31 14:27   ` Kuppuswamy Sathyanarayanan
  2023-09-01  6:04   ` Mika Westerberg
  2 siblings, 1 reply; 22+ messages in thread
From: Andy Shevchenko @ 2023-08-31 13:58 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Mika Westerberg, Hans de Goede, Mark Gross, linux-kernel,
	patches, platform-driver-x86, Kuppuswamy Sathyanarayanan,
	Prashant Malani

On Wed, Aug 30, 2023 at 06:14:02PM -0700, Stephen Boyd wrote:
> It's possible for the completion in ipc_wait_for_interrupt() to timeout,
> simply because the interrupt was delayed in being processed. A timeout
> in itself is not an error. This driver should check the status register
> upon a timeout to ensure that scheduling or interrupt processing delays
> don't affect the outcome of the IPC return value.
> 
>  CPU0                                                   SCU
>  ----                                                   ---
>  ipc_wait_for_interrupt()
>   wait_for_completion_timeout(&scu->cmd_complete)
>   [TIMEOUT]                                             status[IPC_BUSY]=0
> 
> Fix this problem by reading the status bit in all cases, regardless of
> the timeout. If the completion times out, we'll assume the problem was
> that the IPC_BUSY bit was still set, but if the status bit is cleared in
> the meantime we know that we hit some scheduling delay and we should
> just check the error bit.

Makes sense, thanks for fixing this!
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Also see below.

...

>  /* Wait till ipc ioc interrupt is received or timeout in 10 HZ */

Not sure if this comment needs to be updated / amended.

...

>  	status = ipc_read_status(scu);
> -	if (status & IPC_STATUS_ERR)
> -		return -EIO;
> +	if (!(status & IPC_STATUS_BUSY))
> +		err = (status & IPC_STATUS_ERR) ? -EIO : 0;
>  
> -	return 0;
> +	return err;

I would write it as:

	status = ipc_read_status(scu);
	if (status & IPC_STATUS_BUSY)
		return 0;
	if (status & IPC_STATUS_ERR)
		return -EIO;

	return 0;

Also would be good, in case you are not doing it yet, to use --patience when
formatting your patches.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] platform/x86: intel_scu_ipc: Fail IPC send if still busy
  2023-08-31  1:14 ` [PATCH 3/3] platform/x86: intel_scu_ipc: Fail IPC send if still busy Stephen Boyd
@ 2023-08-31 14:07   ` Andy Shevchenko
  2023-09-01  6:06     ` Mika Westerberg
  0 siblings, 1 reply; 22+ messages in thread
From: Andy Shevchenko @ 2023-08-31 14:07 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Mika Westerberg, Hans de Goede, Mark Gross, linux-kernel,
	patches, platform-driver-x86, Kuppuswamy Sathyanarayanan,
	Prashant Malani

On Wed, Aug 30, 2023 at 06:14:03PM -0700, Stephen Boyd wrote:
> It's possible for interrupts to get significantly delayed to the point
> that callers of intel_scu_ipc_dev_command() and friends can call the
> function once, hit a timeout, and call it again while the interrupt
> still hasn't been processed. This driver will get seriously confused if
> the interrupt is finally processed after the second IPC has been sent
> with ipc_command(). It won't know which IPC has been completed. This
> could be quite disastrous if calling code assumes something has happened
> upon return from intel_scu_ipc_dev_simple_command() when it actually
> hasn't.
> 
> Let's avoid this scenario by simply returning -EBUSY in this case.
> Hopefully higher layers will know to back off or fail gracefully when
> this happens. It's all highly unlikely anyway, but it's better to be
> correct here as we have no way to know which IPC the status register is
> telling us about if we send a second IPC while the previous IPC is still
> processing.

> +static bool intel_scu_ipc_busy(struct intel_scu_ipc_dev *scu)

static int ?

> +{
> +	u8 status;
> +
> +	status = ipc_read_status(scu);
> +	if (status & IPC_STATUS_BUSY) {

> +		dev_err(&scu->dev, "device is busy\n");

1. Wouldn't it exaggerate the logs? Shouldn't be rate limited?
2. OTOH if we return -EBUSY directly from here, do we need this at all?

> +		return true;
> +	}
> +
> +	return false;
> +}

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-08-31  1:14 ` [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop() Stephen Boyd
  2023-08-31 13:53   ` Andy Shevchenko
@ 2023-08-31 14:15   ` Kuppuswamy Sathyanarayanan
  2023-09-01  5:50   ` Mika Westerberg
  2 siblings, 0 replies; 22+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-08-31 14:15 UTC (permalink / raw)
  To: Stephen Boyd, Mika Westerberg, Hans de Goede, Mark Gross
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Prashant Malani



On 8/30/2023 6:14 PM, Stephen Boyd wrote:
> It's possible for the polling loop in busy_loop() to get scheduled away
> for a long time.
> 
>   status = ipc_read_status(scu);
>   <long time scheduled away>
>   if (!(status & IPC_STATUS_BUSY))
> 
> If this happens, then the status bit could change and this function
> would never test it again after checking the jiffies against the timeout
> limit. Polling code should check the condition one more time after the
> timeout in case this happens.

I think it is not checking the condition, but reading the status one more
time to reflect the latest status.

> 
> The read_poll_timeout() helper implements this logic, and is shorter, so
> simply use that helper here.
> 
> Cc: Prashant Malani <pmalani@chromium.org>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Fixes: e7b7ab3847c9 ("platform/x86: intel_scu_ipc: Sleeping is fine when polling")
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> ---

Otherwise code looks fine to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>


>  drivers/platform/x86/intel_scu_ipc.c | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
> index 6851d10d6582..5a37becc65aa 100644
> --- a/drivers/platform/x86/intel_scu_ipc.c
> +++ b/drivers/platform/x86/intel_scu_ipc.c
> @@ -19,6 +19,7 @@
>  #include <linux/init.h>
>  #include <linux/interrupt.h>
>  #include <linux/io.h>
> +#include <linux/iopoll.h>
>  #include <linux/module.h>
>  #include <linux/slab.h>
>  
> @@ -231,19 +232,15 @@ static inline u32 ipc_data_readl(struct intel_scu_ipc_dev *scu, u32 offset)
>  /* Wait till scu status is busy */
>  static inline int busy_loop(struct intel_scu_ipc_dev *scu)
>  {
> -	unsigned long end = jiffies + IPC_TIMEOUT;
> -
> -	do {
> -		u32 status;
> -
> -		status = ipc_read_status(scu);
> -		if (!(status & IPC_STATUS_BUSY))
> -			return (status & IPC_STATUS_ERR) ? -EIO : 0;
> +	u8 status;
> +	int err;
>  
> -		usleep_range(50, 100);
> -	} while (time_before(jiffies, end));
> +	err = read_poll_timeout(ipc_read_status, status, !(status & IPC_STATUS_BUSY),
> +				100, jiffies_to_usecs(IPC_TIMEOUT), false, scu);
> +	if (err)
> +		return err;
>  
> -	return -ETIMEDOUT;
> +	return (status & IPC_STATUS_ERR) ? -EIO : 0;
>  }
>  
>  /* Wait till ipc ioc interrupt is received or timeout in 10 HZ */

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
  2023-08-31  1:14 ` [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt() Stephen Boyd
  2023-08-31 13:58   ` Andy Shevchenko
@ 2023-08-31 14:27   ` Kuppuswamy Sathyanarayanan
  2023-09-05 22:56     ` Stephen Boyd
  2023-09-01  6:04   ` Mika Westerberg
  2 siblings, 1 reply; 22+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-08-31 14:27 UTC (permalink / raw)
  To: Stephen Boyd, Mika Westerberg, Hans de Goede, Mark Gross
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Prashant Malani



On 8/30/2023 6:14 PM, Stephen Boyd wrote:
> It's possible for the completion in ipc_wait_for_interrupt() to timeout,
> simply because the interrupt was delayed in being processed. A timeout
> in itself is not an error. This driver should check the status register
> upon a timeout to ensure that scheduling or interrupt processing delays
> don't affect the outcome of the IPC return value.
> 
>  CPU0                                                   SCU
>  ----                                                   ---
>  ipc_wait_for_interrupt()
>   wait_for_completion_timeout(&scu->cmd_complete)
>   [TIMEOUT]                                             status[IPC_BUSY]=0
> 
> Fix this problem by reading the status bit in all cases, regardless of
> the timeout. If the completion times out, we'll assume the problem was
> that the IPC_BUSY bit was still set, but if the status bit is cleared in
> the meantime we know that we hit some scheduling delay and we should
> just check the error bit.
> 
> Cc: Prashant Malani <pmalani@chromium.org>
> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Fixes: ed12f295bfd5 ("ipc: Added support for IPC interrupt mode")
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> ---
>  drivers/platform/x86/intel_scu_ipc.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
> index 5a37becc65aa..2a21153e3bf3 100644
> --- a/drivers/platform/x86/intel_scu_ipc.c
> +++ b/drivers/platform/x86/intel_scu_ipc.c
> @@ -246,16 +246,19 @@ static inline int busy_loop(struct intel_scu_ipc_dev *scu)
>  /* Wait till ipc ioc interrupt is received or timeout in 10 HZ */
>  static inline int ipc_wait_for_interrupt(struct intel_scu_ipc_dev *scu)
>  {
> -	int status;
> +	unsigned long time_left;
> +	u8 status;
> +	int err = 0;
>  
> -	if (!wait_for_completion_timeout(&scu->cmd_complete, IPC_TIMEOUT))
> -		return -ETIMEDOUT;
> +	time_left = wait_for_completion_timeout(&scu->cmd_complete, IPC_TIMEOUT);
> +	if (!time_left)
> +		err = -ETIMEDOUT;

Since you are not using the return value, I would not use time_left. I think the
following version is easy to read. But it is up to you.

wait_for_completion_timeout(&scu->cmd_complete, IPC_TIMEOUT)

status = ipc_read_status(scu);

if (status & IPC_STATUS_BUSY)
   return -ETIMEDOUT;

if (status & IPC_STATUS_ERR)
   return -EIO;

return 0;

With above fixed, you can add

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>


>  
>  	status = ipc_read_status(scu);
> -	if (status & IPC_STATUS_ERR)
> -		return -EIO;
> +	if (!(status & IPC_STATUS_BUSY))
> +		err = (status & IPC_STATUS_ERR) ? -EIO : 0;
>  
> -	return 0;
> +	return err;
>  }
>  
>  static int intel_scu_ipc_check_status(struct intel_scu_ipc_dev *scu)

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-08-31  1:14 ` [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop() Stephen Boyd
  2023-08-31 13:53   ` Andy Shevchenko
  2023-08-31 14:15   ` Kuppuswamy Sathyanarayanan
@ 2023-09-01  5:50   ` Mika Westerberg
  2023-09-05 22:27     ` Stephen Boyd
  2 siblings, 1 reply; 22+ messages in thread
From: Mika Westerberg @ 2023-09-01  5:50 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Hans de Goede, Mark Gross, linux-kernel, patches,
	platform-driver-x86, Andy Shevchenko, Kuppuswamy Sathyanarayanan,
	Prashant Malani

On Wed, Aug 30, 2023 at 06:14:01PM -0700, Stephen Boyd wrote:
> It's possible for the polling loop in busy_loop() to get scheduled away
> for a long time.
> 
>   status = ipc_read_status(scu);
>   <long time scheduled away>
>   if (!(status & IPC_STATUS_BUSY))

How can the status bit change here as we are the only user and the SCU
access is serialized by ipclock?

> If this happens, then the status bit could change and this function
> would never test it again after checking the jiffies against the timeout
> limit. Polling code should check the condition one more time after the
> timeout in case this happens.
> 
> The read_poll_timeout() helper implements this logic, and is shorter, so
> simply use that helper here.

Yes, I agree it makes the code simpler.

> Cc: Prashant Malani <pmalani@chromium.org>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Fixes: e7b7ab3847c9 ("platform/x86: intel_scu_ipc: Sleeping is fine when polling")
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> ---
>  drivers/platform/x86/intel_scu_ipc.c | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
> index 6851d10d6582..5a37becc65aa 100644
> --- a/drivers/platform/x86/intel_scu_ipc.c
> +++ b/drivers/platform/x86/intel_scu_ipc.c
> @@ -19,6 +19,7 @@
>  #include <linux/init.h>
>  #include <linux/interrupt.h>
>  #include <linux/io.h>
> +#include <linux/iopoll.h>
>  #include <linux/module.h>
>  #include <linux/slab.h>
>  
> @@ -231,19 +232,15 @@ static inline u32 ipc_data_readl(struct intel_scu_ipc_dev *scu, u32 offset)
>  /* Wait till scu status is busy */
>  static inline int busy_loop(struct intel_scu_ipc_dev *scu)
>  {
> -	unsigned long end = jiffies + IPC_TIMEOUT;
> -
> -	do {
> -		u32 status;
> -
> -		status = ipc_read_status(scu);
> -		if (!(status & IPC_STATUS_BUSY))
> -			return (status & IPC_STATUS_ERR) ? -EIO : 0;
> +	u8 status;
> +	int err;
>  
> -		usleep_range(50, 100);
> -	} while (time_before(jiffies, end));
> +	err = read_poll_timeout(ipc_read_status, status, !(status & IPC_STATUS_BUSY),
> +				100, jiffies_to_usecs(IPC_TIMEOUT), false, scu);
> +	if (err)
> +		return err;
>  
> -	return -ETIMEDOUT;
> +	return (status & IPC_STATUS_ERR) ? -EIO : 0;
>  }
>  
>  /* Wait till ipc ioc interrupt is received or timeout in 10 HZ */
> -- 
> https://chromeos.dev

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
  2023-08-31  1:14 ` [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt() Stephen Boyd
  2023-08-31 13:58   ` Andy Shevchenko
  2023-08-31 14:27   ` Kuppuswamy Sathyanarayanan
@ 2023-09-01  6:04   ` Mika Westerberg
  2 siblings, 0 replies; 22+ messages in thread
From: Mika Westerberg @ 2023-09-01  6:04 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Hans de Goede, Mark Gross, linux-kernel, patches,
	platform-driver-x86, Andy Shevchenko, Kuppuswamy Sathyanarayanan,
	Prashant Malani

On Wed, Aug 30, 2023 at 06:14:02PM -0700, Stephen Boyd wrote:
> It's possible for the completion in ipc_wait_for_interrupt() to timeout,
> simply because the interrupt was delayed in being processed. A timeout
> in itself is not an error. This driver should check the status register
> upon a timeout to ensure that scheduling or interrupt processing delays
> don't affect the outcome of the IPC return value.
> 
>  CPU0                                                   SCU
>  ----                                                   ---
>  ipc_wait_for_interrupt()
>   wait_for_completion_timeout(&scu->cmd_complete)
>   [TIMEOUT]                                             status[IPC_BUSY]=0
> 
> Fix this problem by reading the status bit in all cases, regardless of
> the timeout. If the completion times out, we'll assume the problem was
> that the IPC_BUSY bit was still set, but if the status bit is cleared in
> the meantime we know that we hit some scheduling delay and we should
> just check the error bit.
> 
> Cc: Prashant Malani <pmalani@chromium.org>
> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] platform/x86: intel_scu_ipc: Fail IPC send if still busy
  2023-08-31 14:07   ` Andy Shevchenko
@ 2023-09-01  6:06     ` Mika Westerberg
  2023-09-05 22:42       ` Stephen Boyd
  0 siblings, 1 reply; 22+ messages in thread
From: Mika Westerberg @ 2023-09-01  6:06 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Stephen Boyd, Hans de Goede, Mark Gross, linux-kernel, patches,
	platform-driver-x86, Kuppuswamy Sathyanarayanan, Prashant Malani

On Thu, Aug 31, 2023 at 05:07:26PM +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 06:14:03PM -0700, Stephen Boyd wrote:
> > It's possible for interrupts to get significantly delayed to the point
> > that callers of intel_scu_ipc_dev_command() and friends can call the
> > function once, hit a timeout, and call it again while the interrupt
> > still hasn't been processed. This driver will get seriously confused if
> > the interrupt is finally processed after the second IPC has been sent
> > with ipc_command(). It won't know which IPC has been completed. This
> > could be quite disastrous if calling code assumes something has happened
> > upon return from intel_scu_ipc_dev_simple_command() when it actually
> > hasn't.
> > 
> > Let's avoid this scenario by simply returning -EBUSY in this case.
> > Hopefully higher layers will know to back off or fail gracefully when
> > this happens. It's all highly unlikely anyway, but it's better to be
> > correct here as we have no way to know which IPC the status register is
> > telling us about if we send a second IPC while the previous IPC is still
> > processing.
> 
> > +static bool intel_scu_ipc_busy(struct intel_scu_ipc_dev *scu)
> 
> static int ?
> 
> > +{
> > +	u8 status;
> > +
> > +	status = ipc_read_status(scu);
> > +	if (status & IPC_STATUS_BUSY) {
> 
> > +		dev_err(&scu->dev, "device is busy\n");
> 
> 1. Wouldn't it exaggerate the logs? Shouldn't be rate limited?
> 2. OTOH if we return -EBUSY directly from here, do we need this at all?

Agree w/ returning -EBUSY here and dropping the dev_err() (or using
dev_dbg()).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-08-31 13:53   ` Andy Shevchenko
@ 2023-09-05 22:24     ` Stephen Boyd
  2023-09-06 13:58       ` Andy Shevchenko
  0 siblings, 1 reply; 22+ messages in thread
From: Stephen Boyd @ 2023-09-05 22:24 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Mika Westerberg, Hans de Goede, Mark Gross, linux-kernel,
	patches, platform-driver-x86, Kuppuswamy Sathyanarayanan,
	Prashant Malani

Quoting Andy Shevchenko (2023-08-31 06:53:14)
> On Wed, Aug 30, 2023 at 06:14:01PM -0700, Stephen Boyd wrote:
> > It's possible for the polling loop in busy_loop() to get scheduled away
> > for a long time.
> >
> >   status = ipc_read_status(scu);
> >   <long time scheduled away>
> >   if (!(status & IPC_STATUS_BUSY))
> >
> > If this happens, then the status bit could change and this function
> > would never test it again after checking the jiffies against the timeout
> > limit. Polling code should check the condition one more time after the
> > timeout in case this happens.
> >
> > The read_poll_timeout() helper implements this logic, and is shorter, so
> > simply use that helper here.
>
> I don't remember by heart, but on some older Intel hardware this might have
> been called during early stages where ktime() is not functional yet.
>
> Is this still a case here?

I have no idea if that happens in early stages. What about
suspend/resume though? I suppose timekeeping could be suspended in that
case, so we can't really check anything with ktime.

I can rework this patch to simply recheck the busy bit so that we don't
have to figure out if the code is called early or from suspend paths.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-09-01  5:50   ` Mika Westerberg
@ 2023-09-05 22:27     ` Stephen Boyd
  2023-09-06 13:46       ` Andy Shevchenko
  0 siblings, 1 reply; 22+ messages in thread
From: Stephen Boyd @ 2023-09-05 22:27 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Hans de Goede, Mark Gross, linux-kernel, patches,
	platform-driver-x86, Andy Shevchenko, Kuppuswamy Sathyanarayanan,
	Prashant Malani

Quoting Mika Westerberg (2023-08-31 22:50:11)
> On Wed, Aug 30, 2023 at 06:14:01PM -0700, Stephen Boyd wrote:
> > It's possible for the polling loop in busy_loop() to get scheduled away
> > for a long time.
> >
> >   status = ipc_read_status(scu);
> >   <long time scheduled away>
> >   if (!(status & IPC_STATUS_BUSY))
>
> How can the status bit change here as we are the only user and the SCU
> access is serialized by ipclock?

I don't know how the SCU works. I thought that IPC_STATUS_BUSY bit was
cleared by the SCU when it was done processing. With that assumption, I
tried to show that the status is read and then the process schedules
away for a long time and has an outdated view of the busy bit.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
  2023-08-31 13:58   ` Andy Shevchenko
@ 2023-09-05 22:36     ` Stephen Boyd
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Boyd @ 2023-09-05 22:36 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Mika Westerberg, Hans de Goede, Mark Gross, linux-kernel,
	patches, platform-driver-x86, Kuppuswamy Sathyanarayanan,
	Prashant Malani

Quoting Andy Shevchenko (2023-08-31 06:58:18)
> On Wed, Aug 30, 2023 at 06:14:02PM -0700, Stephen Boyd wrote:
> > It's possible for the completion in ipc_wait_for_interrupt() to timeout,
> > simply because the interrupt was delayed in being processed. A timeout
> > in itself is not an error. This driver should check the status register
> > upon a timeout to ensure that scheduling or interrupt processing delays
> > don't affect the outcome of the IPC return value.
> >
> >  CPU0                                                   SCU
> >  ----                                                   ---
> >  ipc_wait_for_interrupt()
> >   wait_for_completion_timeout(&scu->cmd_complete)
> >   [TIMEOUT]                                             status[IPC_BUSY]=0
> >
> > Fix this problem by reading the status bit in all cases, regardless of
> > the timeout. If the completion times out, we'll assume the problem was
> > that the IPC_BUSY bit was still set, but if the status bit is cleared in
> > the meantime we know that we hit some scheduling delay and we should
> > just check the error bit.
>
> Makes sense, thanks for fixing this!
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>
> Also see below.
>
> ...
>
> >  /* Wait till ipc ioc interrupt is received or timeout in 10 HZ */
>
> Not sure if this comment needs to be updated / amended.

Or removed?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] platform/x86: intel_scu_ipc: Fail IPC send if still busy
  2023-09-01  6:06     ` Mika Westerberg
@ 2023-09-05 22:42       ` Stephen Boyd
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Boyd @ 2023-09-05 22:42 UTC (permalink / raw)
  To: Andy Shevchenko, Mika Westerberg
  Cc: Hans de Goede, Mark Gross, linux-kernel, patches,
	platform-driver-x86, Kuppuswamy Sathyanarayanan, Prashant Malani

Quoting Mika Westerberg (2023-08-31 23:06:33)
> On Thu, Aug 31, 2023 at 05:07:26PM +0300, Andy Shevchenko wrote:
> > On Wed, Aug 30, 2023 at 06:14:03PM -0700, Stephen Boyd wrote:
> > > It's possible for interrupts to get significantly delayed to the point
> > > that callers of intel_scu_ipc_dev_command() and friends can call the
> > > function once, hit a timeout, and call it again while the interrupt
> > > still hasn't been processed. This driver will get seriously confused if
> > > the interrupt is finally processed after the second IPC has been sent
> > > with ipc_command(). It won't know which IPC has been completed. This
> > > could be quite disastrous if calling code assumes something has happened
> > > upon return from intel_scu_ipc_dev_simple_command() when it actually
> > > hasn't.
> > >
> > > Let's avoid this scenario by simply returning -EBUSY in this case.
> > > Hopefully higher layers will know to back off or fail gracefully when
> > > this happens. It's all highly unlikely anyway, but it's better to be
> > > correct here as we have no way to know which IPC the status register is
> > > telling us about if we send a second IPC while the previous IPC is still
> > > processing.
> >
> > > +static bool intel_scu_ipc_busy(struct intel_scu_ipc_dev *scu)
> >
> > static int ?
> >
> > > +{
> > > +   u8 status;
> > > +
> > > +   status = ipc_read_status(scu);
> > > +   if (status & IPC_STATUS_BUSY) {
> >
> > > +           dev_err(&scu->dev, "device is busy\n");
> >
> > 1. Wouldn't it exaggerate the logs? Shouldn't be rate limited?
> > 2. OTOH if we return -EBUSY directly from here, do we need this at all?
>
> Agree w/ returning -EBUSY here and dropping the dev_err() (or using
> dev_dbg()).

Ok. I'll change to dev_dbg(). I assume that this should never happen,
but you never know if some calling code will ignore the return -EBUSY
from the previous round and call again while the previous IPC is
processing.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes
  2023-08-31  3:28 ` [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Kuppuswamy Sathyanarayanan
@ 2023-09-05 22:55   ` Stephen Boyd
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Boyd @ 2023-09-05 22:55 UTC (permalink / raw)
  To: Hans de Goede, Kuppuswamy Sathyanarayanan, Mark Gross, Mika Westerberg
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Prashant Malani

Quoting Kuppuswamy Sathyanarayanan (2023-08-30 20:28:57)
>
>
> On 8/30/2023 6:14 PM, Stephen Boyd wrote:
> > I recently looked at some crash reports on ChromeOS devices that call
> > into this intel_scu_ipc driver. They were hitting timeouts, and it
> > certainly looks possible for those timeouts to be triggering because of
> > scheduling issues. Once things started going south, the timeouts kept
>
> Are you talking about timeouts during IPC command?

Yes? I see messages like this

	intel_scu_ipc intel_scu_ipc: IPC command 0x200a7 failed with -110

which led me to this driver and I wrote these patches based on that
failure message. I was trying to figure out how that could happen, and
it seems that it could simply be scheduling delays while nothing is
really timing out.

>
> > coming. Maybe that's because the other side got seriously confused? I
> > don't know. I'll poke at it some more by injecting timeouts on the
> > kernel side.
>
> Do you think it is possible due to a firmware issue?

I have no idea. Is there some way to figure that out? I'm not able to
reproduce the problem locally.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
  2023-08-31 14:27   ` Kuppuswamy Sathyanarayanan
@ 2023-09-05 22:56     ` Stephen Boyd
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Boyd @ 2023-09-05 22:56 UTC (permalink / raw)
  To: Hans de Goede, Kuppuswamy Sathyanarayanan, Mark Gross, Mika Westerberg
  Cc: linux-kernel, patches, platform-driver-x86, Andy Shevchenko,
	Prashant Malani

Quoting Kuppuswamy Sathyanarayanan (2023-08-31 07:27:35)
>
>
> Since you are not using the return value, I would not use time_left. I think the
> following version is easy to read. But it is up to you.
>
> wait_for_completion_timeout(&scu->cmd_complete, IPC_TIMEOUT)
>
> status = ipc_read_status(scu);
>
> if (status & IPC_STATUS_BUSY)
>    return -ETIMEDOUT;
>
> if (status & IPC_STATUS_ERR)
>    return -EIO;
>
> return 0;
>
> With above fixed, you can add

Thanks! That's a lot better.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-09-05 22:27     ` Stephen Boyd
@ 2023-09-06 13:46       ` Andy Shevchenko
  2023-09-06 14:31         ` Mika Westerberg
  0 siblings, 1 reply; 22+ messages in thread
From: Andy Shevchenko @ 2023-09-06 13:46 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Mika Westerberg, Hans de Goede, Mark Gross, linux-kernel,
	patches, platform-driver-x86, Kuppuswamy Sathyanarayanan,
	Prashant Malani

On Tue, Sep 05, 2023 at 05:27:23PM -0500, Stephen Boyd wrote:
> Quoting Mika Westerberg (2023-08-31 22:50:11)
> > On Wed, Aug 30, 2023 at 06:14:01PM -0700, Stephen Boyd wrote:
> > > It's possible for the polling loop in busy_loop() to get scheduled away
> > > for a long time.
> > >
> > >   status = ipc_read_status(scu);
> > >   <long time scheduled away>
> > >   if (!(status & IPC_STATUS_BUSY))
> >
> > How can the status bit change here as we are the only user and the SCU
> > access is serialized by ipclock?
> 
> I don't know how the SCU works. I thought that IPC_STATUS_BUSY bit was
> cleared by the SCU when it was done processing. With that assumption, I
> tried to show that the status is read and then the process schedules
> away for a long time and has an outdated view of the busy bit.

We probably have different versions of firmwares for the different SoC
generations. But I _think_ that you are right, the SCU firmware should
clear the bit when it's done.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-09-05 22:24     ` Stephen Boyd
@ 2023-09-06 13:58       ` Andy Shevchenko
  0 siblings, 0 replies; 22+ messages in thread
From: Andy Shevchenko @ 2023-09-06 13:58 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Mika Westerberg, Hans de Goede, Mark Gross, linux-kernel,
	patches, platform-driver-x86, Kuppuswamy Sathyanarayanan,
	Prashant Malani

On Tue, Sep 05, 2023 at 05:24:29PM -0500, Stephen Boyd wrote:
> Quoting Andy Shevchenko (2023-08-31 06:53:14)
> > On Wed, Aug 30, 2023 at 06:14:01PM -0700, Stephen Boyd wrote:
> > > It's possible for the polling loop in busy_loop() to get scheduled away
> > > for a long time.
> > >
> > >   status = ipc_read_status(scu);
> > >   <long time scheduled away>
> > >   if (!(status & IPC_STATUS_BUSY))
> > >
> > > If this happens, then the status bit could change and this function
> > > would never test it again after checking the jiffies against the timeout
> > > limit. Polling code should check the condition one more time after the
> > > timeout in case this happens.
> > >
> > > The read_poll_timeout() helper implements this logic, and is shorter, so
> > > simply use that helper here.
> >
> > I don't remember by heart, but on some older Intel hardware this might have
> > been called during early stages where ktime() is not functional yet.
> >
> > Is this still a case here?
> 
> I have no idea if that happens in early stages.

I briefly browsed the current tree and it seems it's not the case.

> What about
> suspend/resume though? I suppose timekeeping could be suspended in that
> case, so we can't really check anything with ktime.

Hmm... SCU itself is running all the time I think. The timekeeping depends on
the platform, but is it really the case? I dunno.

> I can rework this patch to simply recheck the busy bit so that we don't
> have to figure out if the code is called early or from suspend paths.

Yeah, probably we can do this and leave this nice cleanup in place.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop()
  2023-09-06 13:46       ` Andy Shevchenko
@ 2023-09-06 14:31         ` Mika Westerberg
  0 siblings, 0 replies; 22+ messages in thread
From: Mika Westerberg @ 2023-09-06 14:31 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Stephen Boyd, Hans de Goede, Mark Gross, linux-kernel, patches,
	platform-driver-x86, Kuppuswamy Sathyanarayanan, Prashant Malani

On Wed, Sep 06, 2023 at 04:46:55PM +0300, Andy Shevchenko wrote:
> On Tue, Sep 05, 2023 at 05:27:23PM -0500, Stephen Boyd wrote:
> > Quoting Mika Westerberg (2023-08-31 22:50:11)
> > > On Wed, Aug 30, 2023 at 06:14:01PM -0700, Stephen Boyd wrote:
> > > > It's possible for the polling loop in busy_loop() to get scheduled away
> > > > for a long time.
> > > >
> > > >   status = ipc_read_status(scu);
> > > >   <long time scheduled away>
> > > >   if (!(status & IPC_STATUS_BUSY))
> > >
> > > How can the status bit change here as we are the only user and the SCU
> > > access is serialized by ipclock?
> > 
> > I don't know how the SCU works. I thought that IPC_STATUS_BUSY bit was
> > cleared by the SCU when it was done processing. With that assumption, I
> > tried to show that the status is read and then the process schedules
> > away for a long time and has an outdated view of the busy bit.
> 
> We probably have different versions of firmwares for the different SoC
> generations. But I _think_ that you are right, the SCU firmware should
> clear the bit when it's done.

Yes, IIRC it does. Okay I see the (potential, although quite unlikely)
problem now. Thanks!

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-09-06 14:32 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-31  1:14 [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Stephen Boyd
2023-08-31  1:14 ` [PATCH 1/3] platform/x86: intel_scu_ipc: Check status after timeouts in busy_loop() Stephen Boyd
2023-08-31 13:53   ` Andy Shevchenko
2023-09-05 22:24     ` Stephen Boyd
2023-09-06 13:58       ` Andy Shevchenko
2023-08-31 14:15   ` Kuppuswamy Sathyanarayanan
2023-09-01  5:50   ` Mika Westerberg
2023-09-05 22:27     ` Stephen Boyd
2023-09-06 13:46       ` Andy Shevchenko
2023-09-06 14:31         ` Mika Westerberg
2023-08-31  1:14 ` [PATCH 2/3] platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt() Stephen Boyd
2023-08-31 13:58   ` Andy Shevchenko
2023-09-05 22:36     ` Stephen Boyd
2023-08-31 14:27   ` Kuppuswamy Sathyanarayanan
2023-09-05 22:56     ` Stephen Boyd
2023-09-01  6:04   ` Mika Westerberg
2023-08-31  1:14 ` [PATCH 3/3] platform/x86: intel_scu_ipc: Fail IPC send if still busy Stephen Boyd
2023-08-31 14:07   ` Andy Shevchenko
2023-09-01  6:06     ` Mika Westerberg
2023-09-05 22:42       ` Stephen Boyd
2023-08-31  3:28 ` [PATCH 0/3] platform/x86: intel_scu_ipc: Timeout fixes Kuppuswamy Sathyanarayanan
2023-09-05 22:55   ` Stephen Boyd

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.