linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before system suspend
@ 2021-08-18  2:49 Bard Liao
  2021-08-18  2:49 ` [PATCH v2 1/3] soundwire: intel: fix potential race condition during power down Bard Liao
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Bard Liao @ 2021-08-18  2:49 UTC (permalink / raw)
  To: alsa-devel, vkoul
  Cc: vinod.koul, linux-kernel, gregkh, srinivas.kandagatla,
	rander.wang, pierre-louis.bossart, sanyog.r.kale, bard.liao

Intel validation reported an issue where the HW_RST self-clearing bit
is not cleared in hardware, which as a ripple effect creates issues
with the clock stop mode.

This happens is a specific sequence where the Intel manager is
pm_runtime suspended with the clock-stop mode enabled. During the
system suspend, we currently do nothing, which can lead to potential
issues on system resume and the following pm_runtime suspend,
depending on the hardware state.

This patch suggests a full resume if the clock-stop mode is used. This
may require extra time but will make the suspend/resume flows
completely symmetric. This also removes a race condition where we
could not access SHIM registers if the parent was suspended as
well. Resuming the link also resumes the parent by construction.

BugLink: https://github.com/thesofproject/linux/issues/2606

v2:
 - Better comments and commit messages.
 - Modified the .prepare callback to only deal with the corner case that is
   NOT covered today instead of systematically doing a full resume.

Pierre-Louis Bossart (3):
  soundwire: intel: fix potential race condition during power down
  soundwire: intel: skip suspend/resume/wake when link was not started
  soundwire: intel: conditionally exit clock stop mode on system suspend

 drivers/soundwire/intel.c | 150 ++++++++++++++++++++++++++++++--------
 drivers/soundwire/intel.h |   1 +
 2 files changed, 119 insertions(+), 32 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/3] soundwire: intel: fix potential race condition during power down
  2021-08-18  2:49 [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before system suspend Bard Liao
@ 2021-08-18  2:49 ` Bard Liao
  2021-08-18  2:49 ` [PATCH v2 2/3] soundwire: intel: skip suspend/resume/wake when link was not started Bard Liao
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Bard Liao @ 2021-08-18  2:49 UTC (permalink / raw)
  To: alsa-devel, vkoul
  Cc: vinod.koul, linux-kernel, gregkh, srinivas.kandagatla,
	rander.wang, pierre-louis.bossart, sanyog.r.kale, bard.liao

From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

The power down sequence sets the link_up flag as false outside of the
mutex_lock. This is potentially unsafe.

In additional the flow in that sequence can be improved by first
testing if the link was powered, setting the link_up flag as false and
proceeding with the power down. In case the CPA bits cannot be
cleared, we only flag an error since we cannot deal with interrupts
any longer.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
---
 drivers/soundwire/intel.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 9794bc222fb5..808dda1380c2 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -538,12 +538,14 @@ static int intel_link_power_down(struct sdw_intel *sdw)
 
 	mutex_lock(sdw->link_res->shim_lock);
 
-	intel_shim_master_ip_to_glue(sdw);
-
 	if (!(*shim_mask & BIT(link_id)))
 		dev_err(sdw->cdns.dev,
 			"%s: Unbalanced power-up/down calls\n", __func__);
 
+	sdw->cdns.link_up = false;
+
+	intel_shim_master_ip_to_glue(sdw);
+
 	*shim_mask &= ~BIT(link_id);
 
 	if (!*shim_mask) {
@@ -560,18 +562,19 @@ static int intel_link_power_down(struct sdw_intel *sdw)
 		link_control &=  spa_mask;
 
 		ret = intel_clear_bit(shim, SDW_SHIM_LCTL, link_control, cpa_mask);
+		if (ret < 0) {
+			dev_err(sdw->cdns.dev, "%s: could not power down link\n", __func__);
+
+			/*
+			 * we leave the sdw->cdns.link_up flag as false since we've disabled
+			 * the link at this point and cannot handle interrupts any longer.
+			 */
+		}
 	}
 
 	mutex_unlock(sdw->link_res->shim_lock);
 
-	if (ret < 0) {
-		dev_err(sdw->cdns.dev, "%s: could not power down link\n", __func__);
-
-		return ret;
-	}
-
-	sdw->cdns.link_up = false;
-	return 0;
+	return ret;
 }
 
 static void intel_shim_sync_arm(struct sdw_intel *sdw)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/3] soundwire: intel: skip suspend/resume/wake when link was not started
  2021-08-18  2:49 [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before system suspend Bard Liao
  2021-08-18  2:49 ` [PATCH v2 1/3] soundwire: intel: fix potential race condition during power down Bard Liao
@ 2021-08-18  2:49 ` Bard Liao
  2021-08-18  2:49 ` [PATCH v2 3/3] soundwire: intel: conditionally exit clock stop mode on system suspend Bard Liao
  2021-08-23 12:12 ` [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before " Vinod Koul
  3 siblings, 0 replies; 5+ messages in thread
From: Bard Liao @ 2021-08-18  2:49 UTC (permalink / raw)
  To: alsa-devel, vkoul
  Cc: vinod.koul, linux-kernel, gregkh, srinivas.kandagatla,
	rander.wang, pierre-louis.bossart, sanyog.r.kale, bard.liao

From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

The SoundWire Linux devices are created purely based on information
provided by platform firmware (e.g. ACPI DSDT table). When the kernel
finds a matching driver for the device address (_ADR), the probe will
initialize required data structures and initialize pm ops.

When the SoundWire link is started at a later point, the physical
devices will synchronize on the SoundWire frames and report their
attachment status, thereby triggering the enumeration and
initialization of device registers.

This two-step solution was a conscious design decision to allow e.g. a
driver to use sideband mechanisms to turn power rails on. This can
also allow OEMs to describe multiple platforms with the same DSDT
table, the devices that are not physically present in hardware.

The drawback of this approach is a bit of confusion, with more devices
than are actually present in hardware. This results in 'ghost'
devices, for which the driver successfully probes, but that will not
generate any traffic on the bus. suspend-resume transitions are
handled by drivers, and skipped when the devices are not physically
present.

This patch provides a work-around for a second-level of confusion in
platform firmware: some platforms only use HDaudio links, but
nevertheless expose SoundWire 'ghost' devices. This results in error
messages in the Intel driver while trying to suspend/resume these
links. The simplest solution is to add a boolean status flag to skip
all suspend/resume/wake sequences if the link was never started.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
---
 drivers/soundwire/intel.c | 22 ++++++++++++----------
 drivers/soundwire/intel.h |  1 +
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 808dda1380c2..8b42053b171f 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1525,6 +1525,7 @@ int intel_link_startup(struct auxiliary_device *auxdev)
 	if (!(link_flags & SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE))
 		pm_runtime_idle(dev);
 
+	sdw->startup_done = true;
 	return 0;
 
 err_interrupt:
@@ -1564,8 +1565,9 @@ int intel_link_process_wakeen_event(struct auxiliary_device *auxdev)
 	sdw = dev_get_drvdata(dev);
 	bus = &sdw->cdns.bus;
 
-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n", bus->link_id);
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
+			bus->link_id);
 		return 0;
 	}
 
@@ -1602,8 +1604,8 @@ static int __maybe_unused intel_suspend(struct device *dev)
 	u32 clock_stop_quirks;
 	int ret;
 
-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n",
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
 			bus->link_id);
 		return 0;
 	}
@@ -1656,8 +1658,8 @@ static int __maybe_unused intel_suspend_runtime(struct device *dev)
 	u32 clock_stop_quirks;
 	int ret;
 
-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n",
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
 			bus->link_id);
 		return 0;
 	}
@@ -1721,8 +1723,8 @@ static int __maybe_unused intel_resume(struct device *dev)
 	bool multi_link;
 	int ret;
 
-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n",
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
 			bus->link_id);
 		return 0;
 	}
@@ -1819,8 +1821,8 @@ static int __maybe_unused intel_resume_runtime(struct device *dev)
 	int status;
 	int ret;
 
-	if (bus->prop.hw_disabled) {
-		dev_dbg(dev, "SoundWire master %d is disabled, ignoring\n",
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
 			bus->link_id);
 		return 0;
 	}
diff --git a/drivers/soundwire/intel.h b/drivers/soundwire/intel.h
index 0b47b148da3f..cd93a44dba9a 100644
--- a/drivers/soundwire/intel.h
+++ b/drivers/soundwire/intel.h
@@ -41,6 +41,7 @@ struct sdw_intel {
 	struct sdw_cdns cdns;
 	int instance;
 	struct sdw_intel_link_res *link_res;
+	bool startup_done;
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debugfs;
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 3/3] soundwire: intel: conditionally exit clock stop mode on system suspend
  2021-08-18  2:49 [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before system suspend Bard Liao
  2021-08-18  2:49 ` [PATCH v2 1/3] soundwire: intel: fix potential race condition during power down Bard Liao
  2021-08-18  2:49 ` [PATCH v2 2/3] soundwire: intel: skip suspend/resume/wake when link was not started Bard Liao
@ 2021-08-18  2:49 ` Bard Liao
  2021-08-23 12:12 ` [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before " Vinod Koul
  3 siblings, 0 replies; 5+ messages in thread
From: Bard Liao @ 2021-08-18  2:49 UTC (permalink / raw)
  To: alsa-devel, vkoul
  Cc: vinod.koul, linux-kernel, gregkh, srinivas.kandagatla,
	rander.wang, pierre-louis.bossart, sanyog.r.kale, bard.liao

From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

Intel stress tests reported issues with the clock stop mode,
specifically when trying to do a system suspend while the link is
already pm_runtime suspended.

In this case, we need to disable the shim wake, but when the PCI
parent device is also pm_runtime suspended the SHIM registers are not
accessible.

Since this is an invalid corner case, this patch suggests a pm_runtime
resume of the entire bus to full power (parent+child devices) before
the system suspend so that the shim wake can be disabled.

Unlike the suspend operation, the .prepare callbacks are propagated
from root device to leaf devices. By adding a .prepare callback at the
SoundWire link level, we can double-check the pm_runtime status of the
device as well as its parent PCI device. When the problematic
configuration is detected, the device is pm_runtime resumed - which by
construction also resume its parent.

An additional loop is added to resume all child devices. In theory we
only need to restart the link, but doing so will also cause the
physical devices to synchronize and re-initialize, while their Linux
devices remain pm_runtime suspended. It's simpler to make sure the
codec devices are fully resumed so that we don't have to deal with
zombie states.

This additional loop could have been avoided by adding a .prepare
callback in SoundWire codec drivers. Functionally this would have been
equivalent. The rationale for implementing a loop at the link level is
only to reduce the amount of code required to deal at the codec level
with an Intel corner case - in other words keep codec drivers
independent from Intel platform-specific programming sequences.

BugLink: https://github.com/thesofproject/linux/issues/2606
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
---
 drivers/soundwire/intel.c | 105 +++++++++++++++++++++++++++++++++-----
 1 file changed, 93 insertions(+), 12 deletions(-)

diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 8b42053b171f..f66fcbc33a2f 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1596,6 +1596,87 @@ int intel_link_process_wakeen_event(struct auxiliary_device *auxdev)
  * PM calls
  */
 
+static int intel_resume_child_device(struct device *dev, void *data)
+{
+	int ret;
+	struct sdw_slave *slave = dev_to_sdw_dev(dev);
+
+	if (!slave->probed) {
+		dev_dbg(dev, "%s: skipping device, no probed driver\n", __func__);
+		return 0;
+	}
+	if (!slave->dev_num_sticky) {
+		dev_dbg(dev, "%s: skipping device, never detected on bus\n", __func__);
+		return 0;
+	}
+
+	ret = pm_request_resume(dev);
+	if (ret < 0)
+		dev_err(dev, "%s: pm_request_resume failed: %d\n", __func__, ret);
+
+	return ret;
+}
+
+static int __maybe_unused intel_pm_prepare(struct device *dev)
+{
+	struct sdw_cdns *cdns = dev_get_drvdata(dev);
+	struct sdw_intel *sdw = cdns_to_intel(cdns);
+	struct sdw_bus *bus = &cdns->bus;
+	u32 clock_stop_quirks;
+	int ret = 0;
+
+	if (bus->prop.hw_disabled || !sdw->startup_done) {
+		dev_dbg(dev, "SoundWire master %d is disabled or not-started, ignoring\n",
+			bus->link_id);
+		return 0;
+	}
+
+	clock_stop_quirks = sdw->link_res->clock_stop_quirks;
+
+	if (pm_runtime_suspended(dev) &&
+	    pm_runtime_suspended(dev->parent) &&
+	    ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET) ||
+	     !clock_stop_quirks)) {
+		/*
+		 * if we've enabled clock stop, and the parent is suspended, the SHIM registers
+		 * are not accessible and the shim wake cannot be disabled.
+		 * The only solution is to resume the entire bus to full power
+		 */
+
+		/*
+		 * If any operation in this block fails, we keep going since we don't want
+		 * to prevent system suspend from happening and errors should be recoverable
+		 * on resume.
+		 */
+
+		/*
+		 * first resume the device for this link. This will also by construction
+		 * resume the PCI parent device.
+		 */
+		ret = pm_request_resume(dev);
+		if (ret < 0) {
+			dev_err(dev, "%s: pm_request_resume failed: %d\n", __func__, ret);
+			return 0;
+		}
+
+		/*
+		 * Continue resuming the entire bus (parent + child devices) to exit
+		 * the clock stop mode. If there are no devices connected on this link
+		 * this is a no-op.
+		 * The resume to full power could have been implemented with a .prepare
+		 * step in SoundWire codec drivers. This would however require a lot
+		 * of code to handle an Intel-specific corner case. It is simpler in
+		 * practice to add a loop at the link level.
+		 */
+		ret = device_for_each_child(bus->dev, NULL, intel_resume_child_device);
+
+		if (ret < 0)
+			dev_err(dev, "%s: intel_resume_child_device failed: %d\n", __func__, ret);
+	}
+
+	return 0;
+}
+
 static int __maybe_unused intel_suspend(struct device *dev)
 {
 	struct sdw_cdns *cdns = dev_get_drvdata(dev);
@@ -1615,19 +1696,18 @@ static int __maybe_unused intel_suspend(struct device *dev)
 
 		clock_stop_quirks = sdw->link_res->clock_stop_quirks;
 
-		if ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET ||
-		     !clock_stop_quirks) &&
-		    !pm_runtime_suspended(dev->parent)) {
+		if ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET) ||
+		    !clock_stop_quirks) {
 
-			/*
-			 * if we've enabled clock stop, and the parent
-			 * is still active, disable shim wake. The
-			 * SHIM registers are not accessible if the
-			 * parent is already pm_runtime suspended so
-			 * it's too late to change that configuration
-			 */
-
-			intel_shim_wake(sdw, false);
+			if (pm_runtime_suspended(dev->parent)) {
+				/*
+				 * paranoia check: this should not happen with the .prepare
+				 * resume to full power
+				 */
+				dev_err(dev, "%s: invalid config: parent is suspended\n", __func__);
+			} else {
+				intel_shim_wake(sdw, false);
+			}
 		}
 
 		return 0;
@@ -1992,6 +2072,7 @@ static int __maybe_unused intel_resume_runtime(struct device *dev)
 }
 
 static const struct dev_pm_ops intel_pm = {
+	.prepare = intel_pm_prepare,
 	SET_SYSTEM_SLEEP_PM_OPS(intel_suspend, intel_resume)
 	SET_RUNTIME_PM_OPS(intel_suspend_runtime, intel_resume_runtime, NULL)
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before system suspend
  2021-08-18  2:49 [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before system suspend Bard Liao
                   ` (2 preceding siblings ...)
  2021-08-18  2:49 ` [PATCH v2 3/3] soundwire: intel: conditionally exit clock stop mode on system suspend Bard Liao
@ 2021-08-23 12:12 ` Vinod Koul
  3 siblings, 0 replies; 5+ messages in thread
From: Vinod Koul @ 2021-08-23 12:12 UTC (permalink / raw)
  To: Bard Liao
  Cc: alsa-devel, linux-kernel, gregkh, srinivas.kandagatla,
	rander.wang, pierre-louis.bossart, sanyog.r.kale, bard.liao

On 18-08-21, 10:49, Bard Liao wrote:
> Intel validation reported an issue where the HW_RST self-clearing bit
> is not cleared in hardware, which as a ripple effect creates issues
> with the clock stop mode.
> 
> This happens is a specific sequence where the Intel manager is
> pm_runtime suspended with the clock-stop mode enabled. During the
> system suspend, we currently do nothing, which can lead to potential
> issues on system resume and the following pm_runtime suspend,
> depending on the hardware state.
> 
> This patch suggests a full resume if the clock-stop mode is used. This
> may require extra time but will make the suspend/resume flows
> completely symmetric. This also removes a race condition where we
> could not access SHIM registers if the parent was suspended as
> well. Resuming the link also resumes the parent by construction.

Applied all, thanks

-- 
~Vinod

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-08-23 12:12 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-18  2:49 [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before system suspend Bard Liao
2021-08-18  2:49 ` [PATCH v2 1/3] soundwire: intel: fix potential race condition during power down Bard Liao
2021-08-18  2:49 ` [PATCH v2 2/3] soundwire: intel: skip suspend/resume/wake when link was not started Bard Liao
2021-08-18  2:49 ` [PATCH v2 3/3] soundwire: intel: conditionally exit clock stop mode on system suspend Bard Liao
2021-08-23 12:12 ` [PATCH v2 0/3] soundwire: intel: exit clock-stop mode before " Vinod Koul

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).