linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
@ 2024-02-22 10:12 Théo Lebrun
  2024-02-22 10:12 ` [PATCH v4 1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks Théo Lebrun
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Théo Lebrun @ 2024-02-22 10:12 UTC (permalink / raw)
  To: Mark Brown, Apurva Nandan, Dhruva Gole
  Cc: linux-spi, linux-kernel, Gregory CLEMENT, Vladimir Kondratiev,
	Thomas Petazzoni, Tawfik Bayouk, Théo Lebrun

Hi,

This fixes runtime PM and system-wide suspend for the cadence-qspi
driver. Seeing how runtime PM and autosuspend are enabled by default, I
believe this affects all users of the driver.

This series has been tested on both Mobileye EyeQ5 hardware and the TI
J7200 EVM board, under s2idle.

Thanks all,
Théo

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
Changes in v4:
- Take Reviewed-by Dhruva Gole on patch 1/4.
- Fix struct dev_pm_ops declaration to avoid -Wunused-function warning
  when CONFIG_PM_SLEEP=n. Replace SET_*_PM_OPS() by *_PM_OPS(). See
  kernel test robot warning:
  https://lore.kernel.org/oe-kbuild-all/202402221505.712Q7MSU-lkp@intel.com/
- Link to v3: https://lore.kernel.org/r/20240209-cdns-qspi-pm-fix-v3-0-540ac222f26b@bootlin.com

Changes in v3:
- Move both bugfix patches to the start of the series.
- Remove Fixes: trailer from the function renaming patch.
- Link to v2: https://lore.kernel.org/r/20240205-cdns-qspi-pm-fix-v2-0-2e7bbad49a46@bootlin.com

Changes in v2:
- Split the initial change into three separate commits, to make intents
  clearer.
- Mark controller as suspended during the system-wide suspend.
- Link to v1: https://lore.kernel.org/r/20240202-cdns-qspi-pm-fix-v1-1-3c8feb2bfdd8@bootlin.com

---
Théo Lebrun (4):
      spi: cadence-qspi: fix pointer reference in runtime PM hooks
      spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
      spi: cadence-qspi: put runtime in runtime PM hooks names
      spi: cadence-qspi: add system-wide suspend and resume callbacks

 drivers/spi/spi-cadence-quadspi.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)
---
base-commit: 13acce918af915278e49980a3038df31845dbf39
change-id: 20240202-cdns-qspi-pm-fix-29600cc6d7bf

Best regards,
-- 
Théo Lebrun <theo.lebrun@bootlin.com>



* [PATCH v4 1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks
  2024-02-22 10:12 [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend Théo Lebrun
@ 2024-02-22 10:12 ` Théo Lebrun
  2024-02-22 10:12 ` [PATCH v4 2/4] spi: cadence-qspi: remove system-wide suspend helper calls from " Théo Lebrun
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Théo Lebrun @ 2024-02-22 10:12 UTC (permalink / raw)
  To: Mark Brown, Apurva Nandan, Dhruva Gole
  Cc: linux-spi, linux-kernel, Gregory CLEMENT, Vladimir Kondratiev,
	Thomas Petazzoni, Tawfik Bayouk, Théo Lebrun

dev_get_drvdata() gets used to acquire both the pointer to cqspi and the
pointer to the SPI controller. Neither embeds the other; this leads to
memory corruption.

On a given platform (Mobileye EyeQ5) the memory corruption is hidden
inside cqspi->f_pdata. Also, this uninitialised memory is used as a
mutex (ctlr->bus_lock_mutex) by spi_controller_suspend().

Fixes: 2087e85bb66e ("spi: cadence-quadspi: fix suspend-resume implementations")
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
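The pointer mix-up described in the commit message can be sketched in
userspace as follows. This is only a model under stated assumptions: the
structs are simplified stand-ins for the kernel types, and every model_*
name is hypothetical, not part of the driver or of the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel type, not the real definition. */
struct spi_controller {
	int bus_lock_mutex;	/* placeholder for the real struct mutex */
};

/* The driver state points to the controller; it does not embed it. */
struct cqspi_st {
	struct spi_controller *host;
	int current_cs;
};

struct device {
	void *driver_data;
};

/* Models dev_get_drvdata(): hand back the stored pointer unchanged. */
static void *model_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

/*
 * Lookup as done after the patch: drvdata is the cqspi_st, and the
 * controller is reached through its host member.
 */
static struct spi_controller *model_host_from_dev(const struct device *dev)
{
	struct cqspi_st *cqspi = model_get_drvdata(dev);

	assert(cqspi != NULL);
	return cqspi->host;
}

/*
 * Lookup as done before the patch: the same drvdata pointer is taken
 * to be the controller. Returned as void * so that this model never
 * dereferences it through the wrong type.
 */
static void *model_buggy_host_from_dev(const struct device *dev)
{
	return model_get_drvdata(dev);
}
```

In this model the buggy lookup returns the address of the cqspi_st
itself, so any controller field accessed through it (such as the bus
lock) would alias unrelated driver state.
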
 drivers/spi/spi-cadence-quadspi.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 74647dfcb86c..d19ba024c80b 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1930,10 +1930,9 @@ static void cqspi_remove(struct platform_device *pdev)
 static int cqspi_suspend(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);
-	struct spi_controller *host = dev_get_drvdata(dev);
 	int ret;
 
-	ret = spi_controller_suspend(host);
+	ret = spi_controller_suspend(cqspi->host);
 	cqspi_controller_enable(cqspi, 0);
 
 	clk_disable_unprepare(cqspi->clk);
@@ -1944,7 +1943,6 @@ static int cqspi_suspend(struct device *dev)
 static int cqspi_resume(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);
-	struct spi_controller *host = dev_get_drvdata(dev);
 
 	clk_prepare_enable(cqspi->clk);
 	cqspi_wait_idle(cqspi);
@@ -1953,7 +1951,7 @@ static int cqspi_resume(struct device *dev)
 	cqspi->current_cs = -1;
 	cqspi->sclk = 0;
 
-	return spi_controller_resume(host);
+	return spi_controller_resume(cqspi->host);
 }
 
 static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_suspend,

-- 
2.43.2



* [PATCH v4 2/4] spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
  2024-02-22 10:12 [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend Théo Lebrun
  2024-02-22 10:12 ` [PATCH v4 1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks Théo Lebrun
@ 2024-02-22 10:12 ` Théo Lebrun
  2024-02-22 10:12 ` [PATCH v4 3/4] spi: cadence-qspi: put runtime in runtime PM hooks names Théo Lebrun
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Théo Lebrun @ 2024-02-22 10:12 UTC (permalink / raw)
  To: Mark Brown, Apurva Nandan, Dhruva Gole
  Cc: linux-spi, linux-kernel, Gregory CLEMENT, Vladimir Kondratiev,
	Thomas Petazzoni, Tawfik Bayouk, Théo Lebrun

The ->runtime_suspend() and ->runtime_resume() callbacks are not
expected to call spi_controller_suspend() and spi_controller_resume().
Remove calls to those in the cadence-qspi driver.

Those helpers have two roles currently:
 - They stop/start the queue, including dealing with the kworker.
 - They toggle the SPI controller SPI_CONTROLLER_SUSPENDED flag. It
   requires acquiring ctlr->bus_lock_mutex.

The first role is irrelevant because cadence-qspi is not queued. The
second, however, has two implications:
 - A deadlock occurs, because ->runtime_resume() is called in a context
   where the lock is already taken (in the ->exec_op() callback, where
   the usage count is incremented).
 - It would disallow all operations once the device is auto-suspended.

Here is a brief call tree highlighting the mutex deadlock:

spi_mem_exec_op()
        ...
        spi_mem_access_start()
                mutex_lock(&ctlr->bus_lock_mutex)

        cqspi_exec_mem_op()
                pm_runtime_resume_and_get()
                        cqspi_resume()
                                spi_controller_resume()
                                        mutex_lock(&ctlr->bus_lock_mutex)
                ...

        spi_mem_access_end()
                mutex_unlock(&ctlr->bus_lock_mutex)
        ...

Fixes: 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support")
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
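The call tree in the commit message above can be modelled in userspace
as follows. This is a sketch under stated assumptions: the lock is a
plain flag with a try-lock that reports failure where a real kernel
mutex would block forever, and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Non-recursive lock stand-in; a kernel mutex would block instead. */
struct model_mutex {
	bool locked;
};

static bool model_mutex_trylock(struct model_mutex *m)
{
	if (m->locked)
		return false;
	m->locked = true;
	return true;
}

static void model_mutex_unlock(struct model_mutex *m)
{
	assert(m->locked);
	m->locked = false;
}

struct model_ctlr {
	struct model_mutex bus_lock_mutex;
};

/* Old runtime resume hook: calls the helper that takes the bus lock. */
static bool model_runtime_resume_old(struct model_ctlr *ctlr)
{
	if (!model_mutex_trylock(&ctlr->bus_lock_mutex))
		return false;	/* the kernel would hang here */
	model_mutex_unlock(&ctlr->bus_lock_mutex);
	return true;
}

/* New runtime resume hook: hardware state only, no bus lock. */
static bool model_runtime_resume_new(struct model_ctlr *ctlr)
{
	(void)ctlr;
	return true;
}

/* Models spi_mem_exec_op(): the bus lock is held around the driver op. */
static bool model_exec_op(struct model_ctlr *ctlr,
			  bool (*runtime_resume)(struct model_ctlr *))
{
	bool ok;

	if (!model_mutex_trylock(&ctlr->bus_lock_mutex))
		return false;
	ok = runtime_resume(ctlr);	/* pm_runtime_resume_and_get() path */
	model_mutex_unlock(&ctlr->bus_lock_mutex);
	return ok;
}
```

Passing model_runtime_resume_old to model_exec_op() fails on the second
lock attempt, which corresponds to the deadlock; passing
model_runtime_resume_new succeeds.
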
 drivers/spi/spi-cadence-quadspi.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index d19ba024c80b..809bbbb876ad 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1930,14 +1930,10 @@ static void cqspi_remove(struct platform_device *pdev)
 static int cqspi_suspend(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);
-	int ret;
 
-	ret = spi_controller_suspend(cqspi->host);
 	cqspi_controller_enable(cqspi, 0);
-
 	clk_disable_unprepare(cqspi->clk);
-
-	return ret;
+	return 0;
 }
 
 static int cqspi_resume(struct device *dev)
@@ -1950,8 +1946,7 @@ static int cqspi_resume(struct device *dev)
 
 	cqspi->current_cs = -1;
 	cqspi->sclk = 0;
-
-	return spi_controller_resume(cqspi->host);
+	return 0;
 }
 
 static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_suspend,

-- 
2.43.2



* [PATCH v4 3/4] spi: cadence-qspi: put runtime in runtime PM hooks names
  2024-02-22 10:12 [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend Théo Lebrun
  2024-02-22 10:12 ` [PATCH v4 1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks Théo Lebrun
  2024-02-22 10:12 ` [PATCH v4 2/4] spi: cadence-qspi: remove system-wide suspend helper calls from " Théo Lebrun
@ 2024-02-22 10:12 ` Théo Lebrun
  2024-02-22 10:22   ` Dhruva Gole
  2024-02-22 10:12 ` [PATCH v4 4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks Théo Lebrun
  2024-02-22 19:13 ` [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend Mark Brown
  4 siblings, 1 reply; 14+ messages in thread
From: Théo Lebrun @ 2024-02-22 10:12 UTC (permalink / raw)
  To: Mark Brown, Apurva Nandan, Dhruva Gole
  Cc: linux-spi, linux-kernel, Gregory CLEMENT, Vladimir Kondratiev,
	Thomas Petazzoni, Tawfik Bayouk, Théo Lebrun

Follow the kernel naming convention with regard to power-management
callback function names.

The convention in the kernel is:
 - prefix_suspend means the system-wide suspend callback;
 - prefix_runtime_suspend means the runtime PM suspend callback.
The same applies to resume callbacks.

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/spi/spi-cadence-quadspi.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 809bbbb876ad..ee14965142ba 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1927,7 +1927,7 @@ static void cqspi_remove(struct platform_device *pdev)
 	pm_runtime_disable(&pdev->dev);
 }
 
-static int cqspi_suspend(struct device *dev)
+static int cqspi_runtime_suspend(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);
 
@@ -1936,7 +1936,7 @@ static int cqspi_suspend(struct device *dev)
 	return 0;
 }
 
-static int cqspi_resume(struct device *dev)
+static int cqspi_runtime_resume(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);
 
@@ -1949,8 +1949,8 @@ static int cqspi_resume(struct device *dev)
 	return 0;
 }
 
-static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_suspend,
-				 cqspi_resume, NULL);
+static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_runtime_suspend,
+				 cqspi_runtime_resume, NULL);
 
 static const struct cqspi_driver_platdata cdns_qspi = {
 	.quirks = CQSPI_DISABLE_DAC_MODE,

-- 
2.43.2



* [PATCH v4 4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks
  2024-02-22 10:12 [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend Théo Lebrun
                   ` (2 preceding siblings ...)
  2024-02-22 10:12 ` [PATCH v4 3/4] spi: cadence-qspi: put runtime in runtime PM hooks names Théo Lebrun
@ 2024-02-22 10:12 ` Théo Lebrun
  2024-02-22 19:13 ` [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend Mark Brown
  4 siblings, 0 replies; 14+ messages in thread
From: Théo Lebrun @ 2024-02-22 10:12 UTC (permalink / raw)
  To: Mark Brown, Apurva Nandan, Dhruva Gole
  Cc: linux-spi, linux-kernel, Gregory CLEMENT, Vladimir Kondratiev,
	Thomas Petazzoni, Tawfik Bayouk, Théo Lebrun

Each SPI controller is expected to call the spi_controller_suspend() and
spi_controller_resume() helpers at system-wide suspend and resume.

They (1) handle the kthread worker for queued controllers and (2) mark
the controller as suspended to have spi_sync() fail while the
controller is unavailable.

Those two operations do not require the controller to be active, so we
do not need to increment the runtime PM usage counter.

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
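The split between the two kinds of hooks can be sketched in userspace
as follows. This is a model under stated assumptions, not the SPI core:
the struct, the error value and all model_* names are hypothetical, and
the clock flag merely stands in for runtime PM state.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ERR_SUSPENDED (-1)	/* hypothetical error code for this sketch */

struct model_ctlr {
	bool suspended;		/* stand-in for SPI_CONTROLLER_SUSPENDED */
	bool clk_enabled;	/* stand-in for runtime PM state */
};

/* System-wide hooks: only mark the controller, leave the clock alone. */
static int model_system_suspend(struct model_ctlr *c)
{
	c->suspended = true;
	return 0;
}

static int model_system_resume(struct model_ctlr *c)
{
	c->suspended = false;
	return 0;
}

/* Runtime hooks: only gate the clock, never touch the suspended flag. */
static int model_runtime_suspend(struct model_ctlr *c)
{
	c->clk_enabled = false;
	return 0;
}

static int model_runtime_resume(struct model_ctlr *c)
{
	c->clk_enabled = true;
	return 0;
}

/* Models a transfer: refused while system-suspended, else resumes on demand. */
static int model_sync(struct model_ctlr *c)
{
	if (c->suspended)
		return MODEL_ERR_SUSPENDED;
	model_runtime_resume(c);
	assert(c->clk_enabled);
	return 0;
}
```

In this model an auto-suspended controller still accepts transfers,
while a system-suspended one rejects them until the resume hook runs.
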
 drivers/spi/spi-cadence-quadspi.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index ee14965142ba..8bcbab90cb75 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1949,8 +1949,24 @@ static int cqspi_runtime_resume(struct device *dev)
 	return 0;
 }
 
-static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_runtime_suspend,
-				 cqspi_runtime_resume, NULL);
+static int cqspi_suspend(struct device *dev)
+{
+	struct cqspi_st *cqspi = dev_get_drvdata(dev);
+
+	return spi_controller_suspend(cqspi->host);
+}
+
+static int cqspi_resume(struct device *dev)
+{
+	struct cqspi_st *cqspi = dev_get_drvdata(dev);
+
+	return spi_controller_resume(cqspi->host);
+}
+
+static const struct dev_pm_ops cqspi_dev_pm_ops = {
+	RUNTIME_PM_OPS(cqspi_runtime_suspend, cqspi_runtime_resume, NULL)
+	SYSTEM_SLEEP_PM_OPS(cqspi_suspend, cqspi_resume)
+};
 
 static const struct cqspi_driver_platdata cdns_qspi = {
 	.quirks = CQSPI_DISABLE_DAC_MODE,

-- 
2.43.2



* Re: [PATCH v4 3/4] spi: cadence-qspi: put runtime in runtime PM hooks names
  2024-02-22 10:12 ` [PATCH v4 3/4] spi: cadence-qspi: put runtime in runtime PM hooks names Théo Lebrun
@ 2024-02-22 10:22   ` Dhruva Gole
  0 siblings, 0 replies; 14+ messages in thread
From: Dhruva Gole @ 2024-02-22 10:22 UTC (permalink / raw)
  To: Théo Lebrun
  Cc: Mark Brown, Apurva Nandan, linux-spi, linux-kernel,
	Gregory CLEMENT, Vladimir Kondratiev, Thomas Petazzoni,
	Tawfik Bayouk

Hi,

On Feb 22, 2024 at 11:12:31 +0100, Théo Lebrun wrote:
> Follow kernel naming convention with regards to power-management
> callback function names.
> 
> The convention in the kernel is:
>  - prefix_suspend means the system-wide suspend callback;
>  - prefix_runtime_suspend means the runtime PM suspend callback.
> The same applies to resume callbacks.
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> ---

LGTM!

Reviewed-by: Dhruva Gole <d-gole@ti.com>



-- 
Best regards,
Dhruva Gole <d-gole@ti.com>


* Re: [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
  2024-02-22 10:12 [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend Théo Lebrun
                   ` (3 preceding siblings ...)
  2024-02-22 10:12 ` [PATCH v4 4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks Théo Lebrun
@ 2024-02-22 19:13 ` Mark Brown
  2024-02-26 12:18   ` Dhruva Gole
  4 siblings, 1 reply; 14+ messages in thread
From: Mark Brown @ 2024-02-22 19:13 UTC (permalink / raw)
  To: Apurva Nandan, Dhruva Gole, Théo Lebrun
  Cc: linux-spi, linux-kernel, Gregory CLEMENT, Vladimir Kondratiev,
	Thomas Petazzoni, Tawfik Bayouk

On Thu, 22 Feb 2024 11:12:28 +0100, Théo Lebrun wrote:
> This fixes runtime PM and system-wide suspend for the cadence-qspi
> driver. Seeing how runtime PM and autosuspend are enabled by default, I
> believe this affects all users of the driver.
> 
> This series has been tested on both Mobileye EyeQ5 hardware and the TI
> J7200 EVM board, under s2idle.
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks
      commit: 32ce3bb57b6b402de2aec1012511e7ac4e7449dc
[2/4] spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
      commit: 959043afe53ae80633e810416cee6076da6e91c6
[3/4] spi: cadence-qspi: put runtime in runtime PM hooks names
      commit: 4efa1250b59ebf47ce64a7b6b7c3e2e0a2a9d35a
[4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks
      commit: 078d62de433b4f4556bb676e5dd670f0d4103376

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark



* Re: [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
  2024-02-22 19:13 ` [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend Mark Brown
@ 2024-02-26 12:18   ` Dhruva Gole
  2024-02-26 13:27     ` Mark Brown
  2024-02-26 13:36     ` Théo Lebrun
  0 siblings, 2 replies; 14+ messages in thread
From: Dhruva Gole @ 2024-02-26 12:18 UTC (permalink / raw)
  To: Mark Brown
  Cc: Apurva Nandan, Théo Lebrun, linux-spi, linux-kernel,
	Gregory CLEMENT, Vladimir Kondratiev, Thomas Petazzoni,
	Tawfik Bayouk, Nishanth, Vignesh

Hi Mark, Theo,

+ Nishanth, Vignesh (maintainers of TI K3)

On Feb 22, 2024 at 19:13:29 +0000, Mark Brown wrote:
> On Thu, 22 Feb 2024 11:12:28 +0100, Théo Lebrun wrote:
> > This fixes runtime PM and system-wide suspend for the cadence-qspi
> > driver. Seeing how runtime PM and autosuspend are enabled by default, I
> > believe this affects all users of the driver.
> > 
> > This series has been tested on both Mobileye EyeQ5 hardware and the TI
> > J7200 EVM board, under s2idle.
> > 
> > [...]
> 
> Applied to
> 
>    https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
> 
> Thanks!
> 
> [1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks
>       commit: 32ce3bb57b6b402de2aec1012511e7ac4e7449dc
> [2/4] spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
>       commit: 959043afe53ae80633e810416cee6076da6e91c6
> [3/4] spi: cadence-qspi: put runtime in runtime PM hooks names
>       commit: 4efa1250b59ebf47ce64a7b6b7c3e2e0a2a9d35a
> [4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks
>       commit: 078d62de433b4f4556bb676e5dd670f0d4103376

It seems like between 6.8.0-rc5-next-20240220 and
6.8.0-rc5-next-20240222 boot has been broken on some TI K3 platforms.

It particularly seemed related to these patches because we can see
cqspi_probe in the call trace and also cqspi_suspend toward the top.

See logs for kernel crash in [0] and working in [1]


[0] https://gist.github.com/DhruvaG2000/ed997452b41d6e5301598225fc579800
[1] https://gist.github.com/DhruvaG2000/d4e73111aeafaca555ba2d5208deb6dd

-- 
Best regards,
Dhruva Gole <d-gole@ti.com>


* Re: [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
  2024-02-26 12:18   ` Dhruva Gole
@ 2024-02-26 13:27     ` Mark Brown
  2024-02-26 13:40       ` Mark Brown
  2024-02-26 13:36     ` Théo Lebrun
  1 sibling, 1 reply; 14+ messages in thread
From: Mark Brown @ 2024-02-26 13:27 UTC (permalink / raw)
  To: Dhruva Gole
  Cc: Apurva Nandan, Théo Lebrun, linux-spi, linux-kernel,
	Gregory CLEMENT, Vladimir Kondratiev, Thomas Petazzoni,
	Tawfik Bayouk, Nishanth, Vignesh


On Mon, Feb 26, 2024 at 05:48:03PM +0530, Dhruva Gole wrote:
> On Feb 22, 2024 at 19:13:29 +0000, Mark Brown wrote:

> > [1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks
> >       commit: 32ce3bb57b6b402de2aec1012511e7ac4e7449dc
> > [2/4] spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
> >       commit: 959043afe53ae80633e810416cee6076da6e91c6
> > [3/4] spi: cadence-qspi: put runtime in runtime PM hooks names
> >       commit: 4efa1250b59ebf47ce64a7b6b7c3e2e0a2a9d35a
> > [4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks
> >       commit: 078d62de433b4f4556bb676e5dd670f0d4103376

> It seems like between 6.8.0-rc5-next-20240220 and
> 6.8.0-rc5-next-20240222 some of TI K3 platform boot have been broken.

Is this with some specific kernel configuration?

> It particularly seemed related to these patches because we can see
> cqspi_probe in the call trace and also cqspi_suspend toward the top.

It would be useful to bisect which patch, there's only 4 of them...

> See logs for kernel crash in [0] and working in [1]

> [0] https://gist.github.com/DhruvaG2000/ed997452b41d6e5301598225fc579800
> [1] https://gist.github.com/DhruvaG2000/d4e73111aeafaca555ba2d5208deb6dd

The relevant section from the failing log is:

[    1.516342] printk: legacy bootconsole [ns16550a0] disabled
[    1.533247] Unable to handle kernel paging request at virtual address 12800000340001b4

...

[    1.709414] Call trace:
[    1.711852]  __mutex_lock.constprop.0+0x84/0x540
[    1.716460]  __mutex_lock_slowpath+0x14/0x20
[    1.720719]  mutex_lock+0x48/0x54
[    1.724026]  spi_controller_suspend+0x30/0x7c
[    1.728377]  cqspi_suspend+0x1c/0x6c
[    1.731944]  pm_generic_runtime_suspend+0x2c/0x44
[    1.736640]  genpd_runtime_suspend+0xa8/0x254

(it's generally helpful to provide the most relevant section directly.)

The issue here appears to be that we've registered for runtime suspend
prior to registering the controller...


* Re: [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
  2024-02-26 12:18   ` Dhruva Gole
  2024-02-26 13:27     ` Mark Brown
@ 2024-02-26 13:36     ` Théo Lebrun
  2024-02-27 12:00       ` Dhruva Gole
  1 sibling, 1 reply; 14+ messages in thread
From: Théo Lebrun @ 2024-02-26 13:36 UTC (permalink / raw)
  To: Dhruva Gole, Mark Brown
  Cc: Apurva Nandan, linux-spi, linux-kernel, Gregory CLEMENT,
	Vladimir Kondratiev, Thomas Petazzoni, Tawfik Bayouk, Nishanth,
	Vignesh

Hello Dhruva,

On Mon Feb 26, 2024 at 1:18 PM CET, Dhruva Gole wrote:
> Hi Mark, Theo,
>
> + Nishanth, Vignesh (maintainers of TI K3)
>
> On Feb 22, 2024 at 19:13:29 +0000, Mark Brown wrote:
> > On Thu, 22 Feb 2024 11:12:28 +0100, Théo Lebrun wrote:
> > > This fixes runtime PM and system-wide suspend for the cadence-qspi
> > > driver. Seeing how runtime PM and autosuspend are enabled by default, I
> > > believe this affects all users of the driver.
> > > 
> > > This series has been tested on both Mobileye EyeQ5 hardware and the TI
> > > J7200 EVM board, under s2idle.
> > > 
> > > [...]
> > 
> > Applied to
> > 
> >    https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
> > 
> > Thanks!
> > 
> > [1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks
> >       commit: 32ce3bb57b6b402de2aec1012511e7ac4e7449dc
> > [2/4] spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
> >       commit: 959043afe53ae80633e810416cee6076da6e91c6
> > [3/4] spi: cadence-qspi: put runtime in runtime PM hooks names
> >       commit: 4efa1250b59ebf47ce64a7b6b7c3e2e0a2a9d35a
> > [4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks
> >       commit: 078d62de433b4f4556bb676e5dd670f0d4103376
>
> It seems like between 6.8.0-rc5-next-20240220 and
> 6.8.0-rc5-next-20240222 some of TI K3 platform boot have been broken.
>
> It particularly seemed related to these patches because we can see
> cqspi_probe in the call trace and also cqspi_suspend toward the top.
>
> See logs for kernel crash in [0] and working in [1]

I'm guessing we are talking about tags next-20240220 and next-20240222
on: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/

Neither of those tags include the patches about fixing PM hooks.

   ⟩ # next-20240220
   ⟩ git log --oneline --author theo.lebrun 2d5c7b7eb345 \
      drivers/spi/spi-cadence-quadspi.c

   ⟩ # next-20240222
   ⟩ git log --oneline --author theo.lebrun e31185ce00a9 \
      drivers/spi/spi-cadence-quadspi.c
   0f3841a5e115 spi: cadence-qspi: report correct number of chip-select
   7cc3522aedb5 spi: cadence-qspi: set maximum chip-select to 4
   0d62c64a8e48 spi: cadence-qspi: assert each subnode flash CS is valid
   ⟩ # Those are unrelated patches.

The call trace shows this as well: this series renames the runtime
suspend/resume hooks to cqspi_runtime_* while the call stack you gave
mentions cqspi_suspend. Following this series, that function only gets
called at system-wide suspend.

My guess is that this series will rather fix the issue that you are now
facing. :-) Could you try applying them and checking if that fixes your
error?

Regards,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
  2024-02-26 13:27     ` Mark Brown
@ 2024-02-26 13:40       ` Mark Brown
  2024-02-26 13:42         ` Théo Lebrun
  2024-02-27  5:03         ` Dhruva Gole
  0 siblings, 2 replies; 14+ messages in thread
From: Mark Brown @ 2024-02-26 13:40 UTC (permalink / raw)
  To: Dhruva Gole
  Cc: Apurva Nandan, Théo Lebrun, linux-spi, linux-kernel,
	Gregory CLEMENT, Vladimir Kondratiev, Thomas Petazzoni,
	Tawfik Bayouk, Nishanth, Vignesh


On Mon, Feb 26, 2024 at 01:27:57PM +0000, Mark Brown wrote:
> On Mon, Feb 26, 2024 at 05:48:03PM +0530, Dhruva Gole wrote:
> > On Feb 22, 2024 at 19:13:29 +0000, Mark Brown wrote:

> [    1.709414] Call trace:
> [    1.711852]  __mutex_lock.constprop.0+0x84/0x540
> [    1.716460]  __mutex_lock_slowpath+0x14/0x20
> [    1.720719]  mutex_lock+0x48/0x54
> [    1.724026]  spi_controller_suspend+0x30/0x7c
> [    1.728377]  cqspi_suspend+0x1c/0x6c
> [    1.731944]  pm_generic_runtime_suspend+0x2c/0x44
> [    1.736640]  genpd_runtime_suspend+0xa8/0x254

> (it's generally helpful to provide the most relevant section directly.)

> The issue here appears to be that we've registered for runtime suspend
> prior to registering the controller...

Actually, no - after this series cqspi_suspend() is the system not
runtime PM operation and should not be called from runtime suspend.  How
is that happening?


* Re: [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
  2024-02-26 13:40       ` Mark Brown
@ 2024-02-26 13:42         ` Théo Lebrun
  2024-02-27  5:03         ` Dhruva Gole
  1 sibling, 0 replies; 14+ messages in thread
From: Théo Lebrun @ 2024-02-26 13:42 UTC (permalink / raw)
  To: Mark Brown, Dhruva Gole
  Cc: Apurva Nandan, linux-spi, linux-kernel, Gregory CLEMENT,
	Vladimir Kondratiev, Thomas Petazzoni, Tawfik Bayouk, Nishanth,
	Vignesh

Hello,

On Mon Feb 26, 2024 at 2:40 PM CET, Mark Brown wrote:
> On Mon, Feb 26, 2024 at 01:27:57PM +0000, Mark Brown wrote:
> > On Mon, Feb 26, 2024 at 05:48:03PM +0530, Dhruva Gole wrote:
> > > On Feb 22, 2024 at 19:13:29 +0000, Mark Brown wrote:
>
> > [    1.709414] Call trace:
> > [    1.711852]  __mutex_lock.constprop.0+0x84/0x540
> > [    1.716460]  __mutex_lock_slowpath+0x14/0x20
> > [    1.720719]  mutex_lock+0x48/0x54
> > [    1.724026]  spi_controller_suspend+0x30/0x7c
> > [    1.728377]  cqspi_suspend+0x1c/0x6c
> > [    1.731944]  pm_generic_runtime_suspend+0x2c/0x44
> > [    1.736640]  genpd_runtime_suspend+0xa8/0x254
>
> > (it's generally helpful to provide the most relevant section directly.)
>
> > The issue here appears to be that we've registered for runtime suspend
> > prior to registering the controller...
>
> Actually, no - after this series cqspi_suspend() is the system not
> runtime PM operation and should not be called from runtime suspend.  How
> is that happening?

You might have seen my answer by now. This series is not in the tags
quoted. I believe the memory corruption I fixed with this series is
being encountered for the first time on TI hardware. They probably did
not encounter it previously by luck.

Regards,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
  2024-02-26 13:40       ` Mark Brown
  2024-02-26 13:42         ` Théo Lebrun
@ 2024-02-27  5:03         ` Dhruva Gole
  1 sibling, 0 replies; 14+ messages in thread
From: Dhruva Gole @ 2024-02-27  5:03 UTC (permalink / raw)
  To: Mark Brown
  Cc: Apurva Nandan, Théo Lebrun, linux-spi, linux-kernel,
	Gregory CLEMENT, Vladimir Kondratiev, Thomas Petazzoni,
	Tawfik Bayouk, Nishanth, Vignesh

Hi,

On Feb 26, 2024 at 13:40:00 +0000, Mark Brown wrote:
> On Mon, Feb 26, 2024 at 01:27:57PM +0000, Mark Brown wrote:
> > On Mon, Feb 26, 2024 at 05:48:03PM +0530, Dhruva Gole wrote:
> > > On Feb 22, 2024 at 19:13:29 +0000, Mark Brown wrote:
> 
> > [    1.709414] Call trace:
> > [    1.711852]  __mutex_lock.constprop.0+0x84/0x540
> > [    1.716460]  __mutex_lock_slowpath+0x14/0x20
> > [    1.720719]  mutex_lock+0x48/0x54
> > [    1.724026]  spi_controller_suspend+0x30/0x7c
> > [    1.728377]  cqspi_suspend+0x1c/0x6c
> > [    1.731944]  pm_generic_runtime_suspend+0x2c/0x44
> > [    1.736640]  genpd_runtime_suspend+0xa8/0x254
> 
> > (it's generally helpful to provide the most relevant section directly.)
> 
> > The issue here appears to be that we've registered for runtime suspend
> > prior to registering the controller...
> 
> Actually, no - after this series cqspi_suspend() is the system not
> runtime PM operation and should not be called from runtime suspend.  How
> is that happening?

I tried dropping this entire series, it doesn't really solve the kernel
boot issues. Also this particular stack dump isn't easily reproducible
either. Perhaps this series is not the root cause; I will need some
more time to see what's breaking boot for us.

But for now this series seems to be in the clear. Will keep you posted
if I find anything funny here.

FYI- We're just using the arm64 defconfig and respective device DTs


-- 
Best regards,
Dhruva Gole <d-gole@ti.com>


* Re: [PATCH v4 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend
  2024-02-26 13:36     ` Théo Lebrun
@ 2024-02-27 12:00       ` Dhruva Gole
  0 siblings, 0 replies; 14+ messages in thread
From: Dhruva Gole @ 2024-02-27 12:00 UTC (permalink / raw)
  To: Théo Lebrun
  Cc: Mark Brown, Apurva Nandan, linux-spi, linux-kernel,
	Gregory CLEMENT, Vladimir Kondratiev, Thomas Petazzoni,
	Tawfik Bayouk, Nishanth, Vignesh

Hi,

On Feb 26, 2024 at 14:36:17 +0100, Théo Lebrun wrote:
> Hello Dhruva,
> 
> On Mon Feb 26, 2024 at 1:18 PM CET, Dhruva Gole wrote:
> > Hi Mark, Theo,
> >
> > + Nishanth, Vignesh (maintainers of TI K3)
> >
> > On Feb 22, 2024 at 19:13:29 +0000, Mark Brown wrote:
> > > On Thu, 22 Feb 2024 11:12:28 +0100, Théo Lebrun wrote:
> > > > This fixes runtime PM and system-wide suspend for the cadence-qspi
> > > > driver. Seeing how runtime PM and autosuspend are enabled by default, I
> > > > believe this affects all users of the driver.
> > > > 
> > > > This series has been tested on both Mobileye EyeQ5 hardware and the TI
> > > > J7200 EVM board, under s2idle.
> > > > 
> > > > [...]
> > > 
> > > Applied to
> > > 
> > >    https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
> > > 
> > > Thanks!
> > > 
> > > [1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks
> > >       commit: 32ce3bb57b6b402de2aec1012511e7ac4e7449dc
> > > [2/4] spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
> > >       commit: 959043afe53ae80633e810416cee6076da6e91c6
> > > [3/4] spi: cadence-qspi: put runtime in runtime PM hooks names
> > >       commit: 4efa1250b59ebf47ce64a7b6b7c3e2e0a2a9d35a
> > > [4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks
> > >       commit: 078d62de433b4f4556bb676e5dd670f0d4103376
> >
> > It seems like between 6.8.0-rc5-next-20240220 and
> > 6.8.0-rc5-next-20240222 some of TI K3 platform boot have been broken.
> >
> > It particularly seemed related to these patches because we can see
> > cqspi_probe in the call trace and also cqspi_suspend toward the top.
> >
> > See logs for kernel crash in [0] and working in [1]
> 
> I'm guessing we are talking about tags next-20240220 and next-20240222
> on: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/
> 
> Neither of those tags include the patches about fixing PM hooks.
> 
>    ⟩ # next-20240220
>    ⟩ git log --oneline --author theo.lebrun 2d5c7b7eb345 \
>       drivers/spi/spi-cadence-quadspi.c
> 
>    ⟩ # next-20240222
>    ⟩ git log --oneline --author theo.lebrun e31185ce00a9 \
>       drivers/spi/spi-cadence-quadspi.c
>    0f3841a5e115 spi: cadence-qspi: report correct number of chip-select
>    7cc3522aedb5 spi: cadence-qspi: set maximum chip-select to 4
>    0d62c64a8e48 spi: cadence-qspi: assert each subnode flash CS is valid
>    ⟩ # Those are unrelated patches.
> 
> Also it shows from the calltrace: this series renames the runtime
> suspend/resume hooks to cqspi_runtime_* while the callstack you gave
> talks about cqspi_suspend. It only gets called at system-wide suspend
> following this series.
> 
> My guess is that this series will rather fix the issue that you are now
> facing. :-) Could you try applying them and checking if that fixes your
> error?

Indeed, it seems that in our case kernelci generated the 22 Feb build
and no later ones, hence we were not testing -next with your patches
applied.

Please pardon the confusion.

The boot logs are here with local linux build from 27 Feb -next:

https://gist.github.com/DhruvaG2000/78ef6f2953b0940ef8ea38797f2ec6cb

It does seem like these patches help us fix the previous regressions.
Thanks for the fixes.


-- 
Best regards,
Dhruva Gole <d-gole@ti.com>
