[4/6] ACPI: LPSS: Fix ->suspend_late callbacks handling
diff mbox series

Message ID 2981101.tKVHzisTAg@kreacher
State Superseded
Headers show
Series
  • PM: PCI/ACPI: Hibernation handling fixes
Related show

Commit Message

Rafael J. Wysocki June 29, 2019, 9:50 a.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If the resume_from_noirq flag is set in dev_desc, the ->suspend_late
callback provided by the device driver will be invoked at the "noirq"
stage of system suspend, via acpi_lpss_do_suspend_late(), which is
incorrect.

To fix that, drop acpi_lpss_do_suspend_late() and rearrange
acpi_lpss_suspend_late() to call pm_generic_suspend_late()
directly, before calling acpi_lpss_suspend(), in analogy with
acpi_subsys_suspend_late().

Also notice that acpi_subsys_suspend_late() is not used outside
of the file where it is defined, so make it static and drop its
header for the header file containing it.

Fixes: 48402cee6889 (ACPI / LPSS: Resume BYT/CHT I2C controllers from resume_noirq)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/acpi_lpss.c |   23 ++++++++++-------------
 drivers/acpi/device_pm.c |    3 +--
 include/linux/acpi.h     |    2 --
 3 files changed, 11 insertions(+), 17 deletions(-)

Comments

Hans de Goede June 29, 2019, 11:34 a.m. UTC | #1
Hi Rafael,

On 29-06-19 11:50, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If the resume_from_noirq flag is set in dev_desc, the ->suspend_late
> callback provided by the device driver will be invoked at the "noirq"
> stage of system suspend, via acpi_lpss_do_suspend_late(), which is
> incorrect.
> 
> To fix that, drop acpi_lpss_do_suspend_late() and rearrange
> acpi_lpss_suspend_late() to call pm_generic_suspend_late()
> directly, before calling acpi_lpss_suspend(), in analogy with
> acpi_subsys_suspend_late().

Ah now I see the logic in your previous test-patch.

I'm afraid that this is going to break things though, the calling
of the device-driver's suspend-late method at noirq time is
*intentional* !

The resume_from_noirq flag is only set for i2c controllers which
use: drivers/i2c/busses/i2c-designware-platdrv.c as driver.

This driver's suspend late method looks like this:

static int dw_i2c_plat_suspend(struct device *dev)
{
         struct dw_i2c_dev *i_dev = dev_get_drvdata(dev);

         i_dev->suspended = true;

         if (i_dev->shared_with_punit)
                 return 0;

         i_dev->disable(i_dev);
         i2c_dw_prepare_clk(i_dev, false);

         return 0;
}

The i_dev->disable(i_dev) and i2c_dw_prepare_clk(i_dev, false) calls here
will make the i2c controller non functional. But (some of) these i2c
controllers are used by code in the _PS0  / _PS3 methods of some PCI
devices and the PCI core calls _PS0 / _PS3 at *noirq* time, so as explained
in the commit message which introduced acpi_lpss_do_suspend_late():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=48402cee6889fb3fce58e95fea1471626286dc63

We must not only make sure that the suspending of the i2c controller is
ordered so that it happens after these PCI devices are suspended, we must
also make sure that the i2c controller stays functional until the
i2c-controller is put in the suspend-noirq state.

If you really want to go this route, we must duplicate the resume_from_noirq
flag inside drivers/i2c/busses/i2c-designware-platdrv.c, setting it
only for acpi_lpss enumerated devices (the driver handles a whole lot more
devices) ans then make the driver's suspend_late method a no-op and instead
to the suspend from its suspend_noirq callback.

Since pm_generic_suspend_late() is just a wrapper to call dev->driver->pm->suspend_late
duplicating the resume_from_noirq flag inside i2c-designware-platdrv.c seems
unproductive.

Note that we have the same thing going on in acpi_lpss.c with resume_early vs
resume_noirq, we call the resume_early callback from acpi_lpss_resume_noirq
if the resume_from_noirq flag is set.

TL;DR: the behavior you are trying to fix here is intentional and
IMHO this patch should be dropped.

I guess we could / should do a patch adding a comment that the calling
the drivers' suspend_late / resume_early callback at noirq time is intentional
to avoid this confusing people in the future.

###

Patches 1-3 and 6 look good to me and you may add my:
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
to those.

The comments from this patch also impact patch 5 "ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS"
that should be rewritten to call the device's poweroff_late method at noirq
time if the resume_from_noirq flag is set.

I wonder if patch 5 is necessary though, the acpi_lpss code only deals with a small set
of devices, at least i2c-designware-platdrv.c uses SET_LATE_SYSTEM_SLEEP_PM_OPS so the
poweroff and suspend methods are 1 and the same. If this is true for all involved drivers
(I did not check) then patch 5 is not really necessary.

I guess if fixed to deal with resume_from_noirq the same way the suspend methods
do this patch cannot hurt and if we every add separate power_off methods to the
device drivers later we need this.

Regards,

Hans




> Also notice that acpi_subsys_suspend_late() is not used outside
> of the file where it is defined, so make it static and drop its
> header for the header file containing it.
> 
> Fixes: 48402cee6889 (ACPI / LPSS: Resume BYT/CHT I2C controllers from resume_noirq)
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/acpi/acpi_lpss.c |   23 ++++++++++-------------
>   drivers/acpi/device_pm.c |    3 +--
>   include/linux/acpi.h     |    2 --
>   3 files changed, 11 insertions(+), 17 deletions(-)
> 
> Index: linux-pm/drivers/acpi/acpi_lpss.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/acpi_lpss.c
> +++ linux-pm/drivers/acpi/acpi_lpss.c
> @@ -1034,34 +1034,31 @@ static int acpi_lpss_resume(struct devic
>   }
>   
>   #ifdef CONFIG_PM_SLEEP
> -static int acpi_lpss_do_suspend_late(struct device *dev)
> +static int acpi_lpss_suspend_late(struct device *dev)
>   {
> +	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
>   	int ret;
>   
>   	if (dev_pm_smart_suspend_and_suspended(dev))
>   		return 0;
>   
>   	ret = pm_generic_suspend_late(dev);
> -	return ret ? ret : acpi_lpss_suspend(dev, device_may_wakeup(dev));
> -}
> +	if (ret)
> +		return ret;
>   
> -static int acpi_lpss_suspend_late(struct device *dev)
> -{
> -	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
> -
> -	if (pdata->dev_desc->resume_from_noirq)
> -		return 0;
> +	if (!pdata->dev_desc->resume_from_noirq)
> +		return acpi_lpss_suspend(dev, device_may_wakeup(dev));
>   
> -	return acpi_lpss_do_suspend_late(dev);
> +	return 0;
>   }
>   
>   static int acpi_lpss_suspend_noirq(struct device *dev)
>   {
>   	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
> -	int ret;
>   
> -	if (pdata->dev_desc->resume_from_noirq) {
> -		ret = acpi_lpss_do_suspend_late(dev);
> +	if (!dev_pm_smart_suspend_and_suspended(dev) &&
> +	    pdata->dev_desc->resume_from_noirq) {
> +		int ret = acpi_lpss_suspend(dev, device_may_wakeup(dev));
>   		if (ret)
>   			return ret;
>   	}
> Index: linux-pm/drivers/acpi/device_pm.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/device_pm.c
> +++ linux-pm/drivers/acpi/device_pm.c
> @@ -1069,7 +1069,7 @@ EXPORT_SYMBOL_GPL(acpi_subsys_suspend);
>    * Carry out the generic late suspend procedure for @dev and use ACPI to put
>    * it into a low-power state during system transition into a sleep state.
>    */
> -int acpi_subsys_suspend_late(struct device *dev)
> +static int acpi_subsys_suspend_late(struct device *dev)
>   {
>   	int ret;
>   
> @@ -1079,7 +1079,6 @@ int acpi_subsys_suspend_late(struct devi
>   	ret = pm_generic_suspend_late(dev);
>   	return ret ? ret : acpi_dev_suspend(dev, device_may_wakeup(dev));
>   }
> -EXPORT_SYMBOL_GPL(acpi_subsys_suspend_late);
>   
>   /**
>    * acpi_subsys_suspend_noirq - Run the device driver's "noirq" suspend callback.
> Index: linux-pm/include/linux/acpi.h
> ===================================================================
> --- linux-pm.orig/include/linux/acpi.h
> +++ linux-pm/include/linux/acpi.h
> @@ -916,7 +916,6 @@ static inline int acpi_dev_pm_attach(str
>   int acpi_dev_suspend_late(struct device *dev);
>   int acpi_subsys_prepare(struct device *dev);
>   void acpi_subsys_complete(struct device *dev);
> -int acpi_subsys_suspend_late(struct device *dev);
>   int acpi_subsys_suspend_noirq(struct device *dev);
>   int acpi_subsys_suspend(struct device *dev);
>   int acpi_subsys_freeze(struct device *dev);
> @@ -924,7 +923,6 @@ int acpi_subsys_freeze(struct device *de
>   static inline int acpi_dev_resume_early(struct device *dev) { return 0; }
>   static inline int acpi_subsys_prepare(struct device *dev) { return 0; }
>   static inline void acpi_subsys_complete(struct device *dev) {}
> -static inline int acpi_subsys_suspend_late(struct device *dev) { return 0; }
>   static inline int acpi_subsys_suspend_noirq(struct device *dev) { return 0; }
>   static inline int acpi_subsys_suspend(struct device *dev) { return 0; }
>   static inline int acpi_subsys_freeze(struct device *dev) { return 0; }
> 
> 
> 
>
Rafael J. Wysocki June 29, 2019, 10:02 p.m. UTC | #2
On Sat, Jun 29, 2019 at 1:34 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi Rafael,
>
> On 29-06-19 11:50, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > If the resume_from_noirq flag is set in dev_desc, the ->suspend_late
> > callback provided by the device driver will be invoked at the "noirq"
> > stage of system suspend, via acpi_lpss_do_suspend_late(), which is
> > incorrect.
> >
> > To fix that, drop acpi_lpss_do_suspend_late() and rearrange
> > acpi_lpss_suspend_late() to call pm_generic_suspend_late()
> > directly, before calling acpi_lpss_suspend(), in analogy with
> > acpi_subsys_suspend_late().
>
> Ah now I see the logic in your previous test-patch.
>
> I'm afraid that this is going to break things though, the calling
> of the device-driver's suspend-late method at noirq time is
> *intentional* !

But it is a bug too.

> The resume_from_noirq flag is only set for i2c controllers which
> use: drivers/i2c/busses/i2c-designware-platdrv.c as driver.
>
> This driver's suspend late method looks like this:
>
> static int dw_i2c_plat_suspend(struct device *dev)
> {
>          struct dw_i2c_dev *i_dev = dev_get_drvdata(dev);
>
>          i_dev->suspended = true;
>
>          if (i_dev->shared_with_punit)
>                  return 0;
>
>          i_dev->disable(i_dev);
>          i2c_dw_prepare_clk(i_dev, false);
>
>          return 0;
> }
>
> The i_dev->disable(i_dev) and i2c_dw_prepare_clk(i_dev, false) calls here
> will make the i2c controller non functional. But (some of) these i2c
> controllers are used by code in the _PS0  / _PS3 methods of some PCI
> devices and the PCI core calls _PS0 / _PS3 at *noirq* time, so as explained
> in the commit message which introduced acpi_lpss_do_suspend_late():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=48402cee6889fb3fce58e95fea1471626286dc63
>
> We must not only make sure that the suspending of the i2c controller is
> ordered so that it happens after these PCI devices are suspended, we must
> also make sure that the i2c controller stays functional until the
> i2c-controller is put in the suspend-noirq state.
>
> If you really want to go this route, we must duplicate the resume_from_noirq
> flag inside drivers/i2c/busses/i2c-designware-platdrv.c, setting it
> only for acpi_lpss enumerated devices (the driver handles a whole lot more
> devices) ans then make the driver's suspend_late method a no-op and instead
> to the suspend from its suspend_noirq callback.
>
> Since pm_generic_suspend_late() is just a wrapper to call dev->driver->pm->suspend_late
> duplicating the resume_from_noirq flag inside i2c-designware-platdrv.c seems
> unproductive.
>
> Note that we have the same thing going on in acpi_lpss.c with resume_early vs
> resume_noirq, we call the resume_early callback from acpi_lpss_resume_noirq
> if the resume_from_noirq flag is set.
>
> TL;DR: the behavior you are trying to fix here is intentional and
> IMHO this patch should be dropped.

I can drop the patch, but the current code is simply incorrect.

If the driver provided a ->suspend_late callback, it wanted that
callback to be invoked during the "late" stage of suspend.  Calling it
later simply papers over a driver bug.  If invoking that callback
during the "late" stage doesn't work, the driver should have provided
a "noirq" callback instead.

> I guess we could / should do a patch adding a comment that the calling
> the drivers' suspend_late / resume_early callback at noirq time is intentional
> to avoid this confusing people in the future.

No.  We need to fix drivers doing wrong things.

Thanks!
Rafael J. Wysocki June 30, 2019, 9:48 a.m. UTC | #3
On Sun, Jun 30, 2019 at 12:02 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Sat, Jun 29, 2019 at 1:34 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >
> > Hi Rafael,
> >
> > On 29-06-19 11:50, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > If the resume_from_noirq flag is set in dev_desc, the ->suspend_late
> > > callback provided by the device driver will be invoked at the "noirq"
> > > stage of system suspend, via acpi_lpss_do_suspend_late(), which is
> > > incorrect.
> > >
> > > To fix that, drop acpi_lpss_do_suspend_late() and rearrange
> > > acpi_lpss_suspend_late() to call pm_generic_suspend_late()
> > > directly, before calling acpi_lpss_suspend(), in analogy with
> > > acpi_subsys_suspend_late().
> >
> > Ah now I see the logic in your previous test-patch.
> >
> > I'm afraid that this is going to break things though, the calling
> > of the device-driver's suspend-late method at noirq time is
> > *intentional* !
>
> But it is a bug too.

Well, not strictly a bug, but a departure from a clearly established convention.

What happens is that the driver provides ->suspend_late and
->resume_early, because the upper layer normally would remove power
and power up the device at those stages, respectively.  However, due
to dependencies between devices, the upper layer kind of works around
its own limitation.

That is not straightforward at all.

I will retain this setup in the current patch series, but going
forward it would be good to clean it up.  I wonder if using
non-PM-runtime device links to represent the dependencies would work
in this case.

Patch
diff mbox series

Index: linux-pm/drivers/acpi/acpi_lpss.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_lpss.c
+++ linux-pm/drivers/acpi/acpi_lpss.c
@@ -1034,34 +1034,31 @@  static int acpi_lpss_resume(struct devic
 }
 
 #ifdef CONFIG_PM_SLEEP
-static int acpi_lpss_do_suspend_late(struct device *dev)
+static int acpi_lpss_suspend_late(struct device *dev)
 {
+	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
 	int ret;
 
 	if (dev_pm_smart_suspend_and_suspended(dev))
 		return 0;
 
 	ret = pm_generic_suspend_late(dev);
-	return ret ? ret : acpi_lpss_suspend(dev, device_may_wakeup(dev));
-}
+	if (ret)
+		return ret;
 
-static int acpi_lpss_suspend_late(struct device *dev)
-{
-	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
-
-	if (pdata->dev_desc->resume_from_noirq)
-		return 0;
+	if (!pdata->dev_desc->resume_from_noirq)
+		return acpi_lpss_suspend(dev, device_may_wakeup(dev));
 
-	return acpi_lpss_do_suspend_late(dev);
+	return 0;
 }
 
 static int acpi_lpss_suspend_noirq(struct device *dev)
 {
 	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
-	int ret;
 
-	if (pdata->dev_desc->resume_from_noirq) {
-		ret = acpi_lpss_do_suspend_late(dev);
+	if (!dev_pm_smart_suspend_and_suspended(dev) &&
+	    pdata->dev_desc->resume_from_noirq) {
+		int ret = acpi_lpss_suspend(dev, device_may_wakeup(dev));
 		if (ret)
 			return ret;
 	}
Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -1069,7 +1069,7 @@  EXPORT_SYMBOL_GPL(acpi_subsys_suspend);
  * Carry out the generic late suspend procedure for @dev and use ACPI to put
  * it into a low-power state during system transition into a sleep state.
  */
-int acpi_subsys_suspend_late(struct device *dev)
+static int acpi_subsys_suspend_late(struct device *dev)
 {
 	int ret;
 
@@ -1079,7 +1079,6 @@  int acpi_subsys_suspend_late(struct devi
 	ret = pm_generic_suspend_late(dev);
 	return ret ? ret : acpi_dev_suspend(dev, device_may_wakeup(dev));
 }
-EXPORT_SYMBOL_GPL(acpi_subsys_suspend_late);
 
 /**
  * acpi_subsys_suspend_noirq - Run the device driver's "noirq" suspend callback.
Index: linux-pm/include/linux/acpi.h
===================================================================
--- linux-pm.orig/include/linux/acpi.h
+++ linux-pm/include/linux/acpi.h
@@ -916,7 +916,6 @@  static inline int acpi_dev_pm_attach(str
 int acpi_dev_suspend_late(struct device *dev);
 int acpi_subsys_prepare(struct device *dev);
 void acpi_subsys_complete(struct device *dev);
-int acpi_subsys_suspend_late(struct device *dev);
 int acpi_subsys_suspend_noirq(struct device *dev);
 int acpi_subsys_suspend(struct device *dev);
 int acpi_subsys_freeze(struct device *dev);
@@ -924,7 +923,6 @@  int acpi_subsys_freeze(struct device *de
 static inline int acpi_dev_resume_early(struct device *dev) { return 0; }
 static inline int acpi_subsys_prepare(struct device *dev) { return 0; }
 static inline void acpi_subsys_complete(struct device *dev) {}
-static inline int acpi_subsys_suspend_late(struct device *dev) { return 0; }
 static inline int acpi_subsys_suspend_noirq(struct device *dev) { return 0; }
 static inline int acpi_subsys_suspend(struct device *dev) { return 0; }
 static inline int acpi_subsys_freeze(struct device *dev) { return 0; }