All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme/pci: default to simple suspend
@ 2022-02-01 16:50 Keith Busch
  2022-02-01 17:49 ` Vidya Sagar
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Keith Busch @ 2022-02-01 16:50 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, sagi, Keith Busch, Rafael Wysocki, Vidya Sagar

There is no complete set of attributes a driver can check to know if
nvme power management is the correct thing to do in a runtime suspend
situation. Every attempt so far to optimize idle power consumption and
resume latency for a particular platform just leads to regressions
elsewhere.

Default to the simple shutdown since it is the historically safest
option, and provide a user parameter to override it if the user knows
it's safe to use for their platform.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215467
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/nvme/host/pci.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d8585df2c2fd..7e25cdef09a2 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -103,6 +103,10 @@ static bool noacpi;
 module_param(noacpi, bool, 0444);
 MODULE_PARM_DESC(noacpi, "disable acpi bios quirks");
 
+static bool pwr_mgmt;
+module_param(pwr_mgmt, bool, 0444);
+MODULE_PARM_DESC(pwr_mgmt, "use nvme power management for runtime suspend");
+
 struct nvme_dev;
 struct nvme_queue;
 
@@ -3094,7 +3098,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		dev_info(&pdev->dev,
 			 "platform quirk: setting simple suspend\n");
 		quirks |= NVME_QUIRK_SIMPLE_SUSPEND;
-	}
+	} else if (!pwr_mgmt)
+		quirks |= NVME_QUIRK_SIMPLE_SUSPEND;
 
 	/*
 	 * Double check that our mempool alloc size will cover the biggest
-- 
2.25.4



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-01 16:50 [PATCH] nvme/pci: default to simple suspend Keith Busch
@ 2022-02-01 17:49 ` Vidya Sagar
  2022-02-02  7:55 ` Christoph Hellwig
  2022-04-11 13:58 ` Manivannan Sadhasivam
  2 siblings, 0 replies; 11+ messages in thread
From: Vidya Sagar @ 2022-02-01 17:49 UTC (permalink / raw)
  To: Keith Busch, linux-nvme; +Cc: hch, sagi, Rafael Wysocki

Tested-by: Vidya Sagar <vidyas@nvidia.com>

Tested on Tegra194 platform with a Phison E12 NVMe drive.

On 2/1/2022 10:20 PM, Keith Busch wrote:
> External email: Use caution opening links or attachments
> 
> 
> There is no complete set of attributes a driver can check to know if
> nvme power management is the correct thing to do in a runtime suspend
> situation. Every attempt so far to optimize idle power consumption and
> resume latency for a particular platform just leads to regressions
> elsewhere.
> 
> Default to the simple shutdown since it is the historically safest
> option, and provide a user parameter to override it if the user knows
> it's safe to use for their platform.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215467
> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
> Cc: Vidya Sagar <vidyas@nvidia.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>   drivers/nvme/host/pci.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index d8585df2c2fd..7e25cdef09a2 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -103,6 +103,10 @@ static bool noacpi;
>   module_param(noacpi, bool, 0444);
>   MODULE_PARM_DESC(noacpi, "disable acpi bios quirks");
> 
> +static bool pwr_mgmt;
> +module_param(pwr_mgmt, bool, 0444);
> +MODULE_PARM_DESC(pwr_mgmt, "use nvme power management for runtime suspend");
> +
>   struct nvme_dev;
>   struct nvme_queue;
> 
> @@ -3094,7 +3098,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>                  dev_info(&pdev->dev,
>                           "platform quirk: setting simple suspend\n");
>                  quirks |= NVME_QUIRK_SIMPLE_SUSPEND;
> -       }
> +       } else if (!pwr_mgmt)
> +               quirks |= NVME_QUIRK_SIMPLE_SUSPEND;
> 
>          /*
>           * Double check that our mempool alloc size will cover the biggest
> --
> 2.25.4
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-01 16:50 [PATCH] nvme/pci: default to simple suspend Keith Busch
  2022-02-01 17:49 ` Vidya Sagar
@ 2022-02-02  7:55 ` Christoph Hellwig
  2022-02-02 16:03   ` Keith Busch
  2022-04-11 13:58 ` Manivannan Sadhasivam
  2 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2022-02-02  7:55 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme, hch, sagi, Rafael Wysocki, Vidya Sagar

On Tue, Feb 01, 2022 at 08:50:06AM -0800, Keith Busch wrote:
> There is no complete set of attributes a driver can check to know if
> nvme power management is the correct thing to do in a runtime suspend
> situation. Every attempt so far to optimize idle power consumption and
> resume latency for a particular platform just leads to regressions
> elsewhere.
> 
> Default to the simple shutdown since it is the historically safest
> option, and provide a user parameter to override it if the user knows
> it's safe to use for their platform.

Sigh.  The platforms really should be asking for a explicit D3cold if
they need one..


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-02  7:55 ` Christoph Hellwig
@ 2022-02-02 16:03   ` Keith Busch
  2022-02-04  7:10     ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2022-02-02 16:03 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nvme, sagi, Rafael Wysocki, Vidya Sagar

On Wed, Feb 02, 2022 at 08:55:02AM +0100, Christoph Hellwig wrote:
> On Tue, Feb 01, 2022 at 08:50:06AM -0800, Keith Busch wrote:
> > There is no complete set of attributes a driver can check to know if
> > nvme power management is the correct thing to do in a runtime suspend
> > situation. Every attempt so far to optimize idle power consumption and
> > resume latency for a particular platform just leads to regressions
> > elsewhere.
> > 
> > Default to the simple shutdown since it is the historically safest
> > option, and provide a user parameter to override it if the user knows
> > it's safe to use for their platform.
> 
> Sigh.  The platforms really should be asking for a explicit D3cold if
> they need one..

It's too late now, but perhaps the new property should have been
inverted since preparing for D3 was the previous default behavior;
platforms could have instead explicitly asked for "no-D3" if they wanted
it. That would have been easier for backward compatibility.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-02 16:03   ` Keith Busch
@ 2022-02-04  7:10     ` Christoph Hellwig
  2022-02-07 16:06       ` Keith Busch
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2022-02-04  7:10 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, linux-nvme, sagi, Rafael Wysocki, Vidya Sagar

On Wed, Feb 02, 2022 at 08:03:34AM -0800, Keith Busch wrote:
> On Wed, Feb 02, 2022 at 08:55:02AM +0100, Christoph Hellwig wrote:
> > On Tue, Feb 01, 2022 at 08:50:06AM -0800, Keith Busch wrote:
> > > There is no complete set of attributes a driver can check to know if
> > > nvme power management is the correct thing to do in a runtime suspend
> > > situation. Every attempt so far to optimize idle power consumption and
> > > resume latency for a particular platform just leads to regressions
> > > elsewhere.
> > > 
> > > Default to the simple shutdown since it is the historically safest
> > > option, and provide a user parameter to override it if the user knows
> > > it's safe to use for their platform.
> > 
> > Sigh.  The platforms really should be asking for a explicit D3cold if
> > they need one..
> 
> It's too late now, but perhaps the new property should have been
> inverted since preparing for D3 was the previous default behavior;
> platforms could have instead explicitly asked for "no-D3" if they wanted
> it. That would have been easier for backward compatibility.

I'd really prefer to sort this out at the platform level.  We can't work
around broken platforms in nvme forever.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-04  7:10     ` Christoph Hellwig
@ 2022-02-07 16:06       ` Keith Busch
  2022-02-08 14:37         ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2022-02-07 16:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nvme, sagi, Rafael Wysocki, Vidya Sagar

On Fri, Feb 04, 2022 at 08:10:12AM +0100, Christoph Hellwig wrote:
> On Wed, Feb 02, 2022 at 08:03:34AM -0800, Keith Busch wrote:
> > On Wed, Feb 02, 2022 at 08:55:02AM +0100, Christoph Hellwig wrote:
> > > On Tue, Feb 01, 2022 at 08:50:06AM -0800, Keith Busch wrote:
> > > > There is no complete set of attributes a driver can check to know if
> > > > nvme power management is the correct thing to do in a runtime suspend
> > > > situation. Every attempt so far to optimize idle power consumption and
> > > > resume latency for a particular platform just leads to regressions
> > > > elsewhere.
> > > > 
> > > > Default to the simple shutdown since it is the historically safest
> > > > option, and provide a user parameter to override it if the user knows
> > > > it's safe to use for their platform.
> > > 
> > > Sigh.  The platforms really should be asking for a explicit D3cold if
> > > they need one..
> > 
> > It's too late now, but perhaps the new property should have been
> > inverted since preparing for D3 was the previous default behavior;
> > platforms could have instead explicitly asked for "no-D3" if they wanted
> > it. That would have been easier for backward compatibility.
> 
> I'd really prefer to sort this out at the platform level.  We can't work
> around broken platforms in nvme forever.

I agree, but I'm not sure how to get everyone aligned.

How about this to resolve the regressions: if the platform doesn't
provide StorageD3Enable property, can we just default to the simple
shutdown method? We'd only use the nvme power management capabilities if
the platform explicity says it doesn't want D3, making the default the
same as the legacy behavior.

---
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d8585df2c2fd..edd8f7dae77f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -20,6 +20,7 @@
 #include <linux/mutex.h>
 #include <linux/once.h>
 #include <linux/pci.h>
+#include <linux/property.h>
 #include <linux/suspend.h>
 #include <linux/t10-pi.h>
 #include <linux/types.h>
@@ -3086,7 +3087,9 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	quirks |= check_vendor_combination_bug(pdev);
 
-	if (!noacpi && acpi_storage_d3(&pdev->dev)) {
+	if (!noacpi && (acpi_storage_d3(&pdev->dev) ||
+			!device_property_present(&pdev->dev,
+						 "StorageD3Enable"))) {
 		/*
 		 * Some systems use a bios work around to ask for D3 on
 		 * platforms that support kernel managed suspend.
--


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-07 16:06       ` Keith Busch
@ 2022-02-08 14:37         ` Christoph Hellwig
  2022-02-08 17:14           ` Keith Busch
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2022-02-08 14:37 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, linux-nvme, sagi, Rafael Wysocki, Vidya Sagar

On Mon, Feb 07, 2022 at 08:06:55AM -0800, Keith Busch wrote:
> > I'd really prefer to sort this out at the platform level.  We can't work
> > around broken platforms in nvme forever.
> 
> I agree, but I'm not sure how to get everyone aligned.
> 
> How about this to resolve the regressions: if the platform doesn't
> provide StorageD3Enable property, can we just default to the simple
> shutdown method? We'd only use the nvme power management capabilities if
> the platform explicity says it doesn't want D3, making the default the
> same as the legacy behavior.

I don't think this will work, as most older platforms just won't have
that attribute at all, and non-ACPI platforms most certainly won't have
it.  We'll need more quirks in the core PCI/PM code like we did for the
AMD mobile platforms.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-08 14:37         ` Christoph Hellwig
@ 2022-02-08 17:14           ` Keith Busch
  2022-02-09  7:48             ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2022-02-08 17:14 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nvme, sagi, Rafael Wysocki, Vidya Sagar

On Tue, Feb 08, 2022 at 03:37:51PM +0100, Christoph Hellwig wrote:
> On Mon, Feb 07, 2022 at 08:06:55AM -0800, Keith Busch wrote:
> > > I'd really prefer to sort this out at the platform level.  We can't work
> > > around broken platforms in nvme forever.
> > 
> > I agree, but I'm not sure how to get everyone aligned.
> > 
> > How about this to resolve the regressions: if the platform doesn't
> > provide StorageD3Enable property, can we just default to the simple
> > shutdown method? We'd only use the nvme power management capabilities if
> > the platform explicity says it doesn't want D3, making the default the
> > same as the legacy behavior.
> 
> I don't think this will work, as most older platforms just won't have
> that attribute at all, and non-ACPI platforms most certainly won't have
> it.  We'll need more quirks in the core PCI/PM code like we did for the
> AMD mobile platforms.

 old platform + old kernel -> simple suspend

 old platform + new kernel -> nvme pm

This inevitably introduced regressions. Why should we need to quirk old
platforms for a feature that didn't even exist when they were made?
Shouldn't the quirk be the other way around?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-08 17:14           ` Keith Busch
@ 2022-02-09  7:48             ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2022-02-09  7:48 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, linux-nvme, sagi, Rafael Wysocki, Vidya Sagar

On Tue, Feb 08, 2022 at 09:14:21AM -0800, Keith Busch wrote:
> > I don't think this will work, as most older platforms just won't have
> > that attribute at all, and non-ACPI platforms most certainly won't have
> > it.  We'll need more quirks in the core PCI/PM code like we did for the
> > AMD mobile platforms.
> 
>  old platform + old kernel -> simple suspend
> 
>  old platform + new kernel -> nvme pm
> 
> This inevitably introduced regressions. Why should we need to quirk old
> platforms for a feature that didn't even exist when they were made?
> Shouldn't the quirk be the other way around?

Because they are broken?  If we do get a suspend request that does not
explicitly asks for the device do be disabled we need to be able to rely
on that.  Where we is not just nvme, but any driver.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-02-01 16:50 [PATCH] nvme/pci: default to simple suspend Keith Busch
  2022-02-01 17:49 ` Vidya Sagar
  2022-02-02  7:55 ` Christoph Hellwig
@ 2022-04-11 13:58 ` Manivannan Sadhasivam
  2022-04-23  5:49   ` Christoph Hellwig
  2 siblings, 1 reply; 11+ messages in thread
From: Manivannan Sadhasivam @ 2022-04-11 13:58 UTC (permalink / raw)
  To: hch; +Cc: linux-nvme, hch, sagi, Rafael Wysocki, Vidya Sagar, kbusch

Hi Christoph,

On Tue, Feb 01, 2022 at 08:50:06AM -0800, Keith Busch wrote:
> There is no complete set of attributes a driver can check to know if
> nvme power management is the correct thing to do in a runtime suspend
> situation. Every attempt so far to optimize idle power consumption and
> resume latency for a particular platform just leads to regressions
> elsewhere.
> 
> Default to the simple shutdown since it is the historically safest
> option, and provide a user parameter to override it if the user knows
> it's safe to use for their platform.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215467
> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
> Cc: Vidya Sagar <vidyas@nvidia.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>

I was looking into the nvme simple shutdown path for the Qualcomm platforms and
it looks to me that this patch from Keith is the way to go. I do agree with your
resistance on adding the platform level support to the nvme driver, but there is
no good way of working around it in the platform level.

PCI core only accepts the quirks for the host devices that could be passed onto
the PCI device drivers like this one. In this case, this is not a quirk but
actually an aggressive power saving feature (atleast on the Qcom platforms).
Moreover, adding a flag to the PCI bus will make it applicable to all the
child devices of the RC/bridge and that would be wrong.

In our case, the same power saving feature is not applicable to all PCI devices
like WLAN for an example.

Then if we try to add the platform specific support in nvme driver using the
of_machine_is_compatible() API, it will get piled up over the time. So just
using the simple shutdown as the default suspend choice and using the module
parameter to override it makes much sense to me.

Please let me know what you think.

Thanks,
Mani

> ---
>  drivers/nvme/host/pci.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index d8585df2c2fd..7e25cdef09a2 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -103,6 +103,10 @@ static bool noacpi;
>  module_param(noacpi, bool, 0444);
>  MODULE_PARM_DESC(noacpi, "disable acpi bios quirks");
>  
> +static bool pwr_mgmt;
> +module_param(pwr_mgmt, bool, 0444);
> +MODULE_PARM_DESC(pwr_mgmt, "use nvme power management for runtime suspend");
> +
>  struct nvme_dev;
>  struct nvme_queue;
>  
> @@ -3094,7 +3098,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		dev_info(&pdev->dev,
>  			 "platform quirk: setting simple suspend\n");
>  		quirks |= NVME_QUIRK_SIMPLE_SUSPEND;
> -	}
> +	} else if (!pwr_mgmt)
> +		quirks |= NVME_QUIRK_SIMPLE_SUSPEND;
>  
>  	/*
>  	 * Double check that our mempool alloc size will cover the biggest
> -- 
> 2.25.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] nvme/pci: default to simple suspend
  2022-04-11 13:58 ` Manivannan Sadhasivam
@ 2022-04-23  5:49   ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2022-04-23  5:49 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: hch, linux-nvme, sagi, Rafael Wysocki, Vidya Sagar, kbusch, linux-pci

On Mon, Apr 11, 2022 at 07:28:50PM +0530, Manivannan Sadhasivam wrote:
> PCI core only accepts the quirks for the host devices that could be passed onto
> the PCI device drivers like this one. In this case, this is not a quirk but
> actually an aggressive power saving feature (atleast on the Qcom platforms).
> Moreover, adding a flag to the PCI bus will make it applicable to all the
> child devices of the RC/bridge and that would be wrong.

As you correctly state it is not a device quirk.  It describes the
power management applied by the platform.  So we do need to communicate
it through the core PM and/or PCI code.  Please work with the relevant
maintainers.

> In our case, the same power saving feature is not applicable to all PCI devices
> like WLAN for an example.

This doesn't make sense.  Your plaform can't know what device is connected
to a given root port / slot.  So you might have different policies per
slot, but that has nothing to do with the Linux drivers for given devices.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-04-23  5:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-01 16:50 [PATCH] nvme/pci: default to simple suspend Keith Busch
2022-02-01 17:49 ` Vidya Sagar
2022-02-02  7:55 ` Christoph Hellwig
2022-02-02 16:03   ` Keith Busch
2022-02-04  7:10     ` Christoph Hellwig
2022-02-07 16:06       ` Keith Busch
2022-02-08 14:37         ` Christoph Hellwig
2022-02-08 17:14           ` Keith Busch
2022-02-09  7:48             ` Christoph Hellwig
2022-04-11 13:58 ` Manivannan Sadhasivam
2022-04-23  5:49   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.