linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	linux-nvme <linux-nvme@lists.infradead.org>,
	Keith Busch <kbusch@kernel.org>,
	Mario Limonciello <Mario.Limonciello@dell.com>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Keith Busch <keith.busch@intel.com>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	Linux PM <linux-pm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Rajat Jain <rajatja@google.com>,
	Linux PCI <linux-pci@vger.kernel.org>
Subject: Re: [PATCH v2 2/2] nvme-pci: Allow PCI bus-level PM to be used if ASPM is disabled
Date: Thu, 8 Aug 2019 16:47:45 +0200	[thread overview]
Message-ID: <CAJZ5v0h=nz8yXwOOGBUB9m1GtJPOqBwtNK7zXPNMJjzPhMWd9w@mail.gmail.com> (raw)
In-Reply-To: <20190808134356.GF151852@google.com>

On Thu, Aug 8, 2019 at 3:43 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Aug 08, 2019 at 12:10:06PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > One of the modifications made by commit d916b1be94b6 ("nvme-pci: use
> > host managed power state for suspend") was adding a pci_save_state()
> > call to nvme_suspend() in order to prevent the PCI bus-level PM from
> > being applied to the suspended NVMe devices, but if ASPM is not
> > enabled for the target NVMe device, that causes its PCIe link to stay
> > up and the platform may not be able to get into its optimum low-power
> > state because of that.
> >
> > For example, if ASPM is disabled for the NVMe drive (PC401 NVMe SK
> > hynix 256GB) in my Dell XPS13 9380, leaving it in D0 during
> > suspend-to-idle prevents the SoC from reaching package idle states
> > deeper than PC3, which is way insufficient for system suspend.
>
> Just curious: I assume the SoC you reference is some part of the NVMe
> drive?

No, the SoC is what contains the Intel processor and PCH (formerly "chipset").

> > To address this shortcoming, make nvme_suspend() check if ASPM is
> > enabled for the target device and fall back to full device shutdown
> > and PCI bus-level PM if that is not the case.
> >
> > Fixes: d916b1be94b6 ("nvme-pci: use host managed power state for suspend")
> > Link: https://lore.kernel.org/linux-pm/2763495.NmdaWeg79L@kreacher/T/#t
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > -> v2:
> >   * Move the PCI/PCIe ASPM changes to a separate patch.
> >   * Do not add a redundant ndev->last_ps == U32_MAX check in nvme_suspend().
> >
> > ---
> >  drivers/nvme/host/pci.c |   13 ++++++++++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > Index: linux-pm/drivers/nvme/host/pci.c
> > ===================================================================
> > --- linux-pm.orig/drivers/nvme/host/pci.c
> > +++ linux-pm/drivers/nvme/host/pci.c
> > @@ -2846,7 +2846,7 @@ static int nvme_resume(struct device *de
> >       struct nvme_dev *ndev = pci_get_drvdata(to_pci_dev(dev));
> >       struct nvme_ctrl *ctrl = &ndev->ctrl;
> >
> > -     if (pm_resume_via_firmware() || !ctrl->npss ||
> > +     if (ndev->last_ps == U32_MAX ||
> >           nvme_set_power_state(ctrl, ndev->last_ps) != 0)
> >               nvme_reset_ctrl(ctrl);
> >       return 0;
> > @@ -2859,6 +2859,8 @@ static int nvme_suspend(struct device *d
> >       struct nvme_ctrl *ctrl = &ndev->ctrl;
> >       int ret = -EBUSY;
> >
> > +     ndev->last_ps = U32_MAX;
> > +
> >       /*
> >        * The platform does not remove power for a kernel managed suspend so
> >        * use host managed nvme power settings for lowest idle power if
> > @@ -2866,8 +2868,14 @@ static int nvme_suspend(struct device *d
> >        * shutdown.  But if the firmware is involved after the suspend or the
> >        * device does not support any non-default power states, shut down the
> >        * device fully.
> > +      *
> > +      * If ASPM is not enabled for the device, shut down the device and allow
> > +      * the PCI bus layer to put it into D3 in order to take the PCIe link
> > +      * down, so as to allow the platform to achieve its minimum low-power
> > +      * state (which may not be possible if the link is up).
> >        */
> > -     if (pm_suspend_via_firmware() || !ctrl->npss) {
> > +     if (pm_suspend_via_firmware() || !ctrl->npss ||
> > +         !pcie_aspm_enabled_mask(pdev)) {
>
> This seems like a layering violation, in the sense that ASPM is
> supposed to be hardware-autonomous and invisible to software.

But software has to enable it.

If it is not enabled, it will not be used, and that's what the check is about.

> IIUC the NVMe device will go to the desired package idle state if the
> link is in L0s or L1, but not if the link is in L0.  I don't
> understand that connection; AFAIK that would be something outside the
> scope of the PCIe spec.

Yes, it is outside of the PCIe spec.

No, this is not about the NVMe device, it is about the Intel SoC
(System-on-a-Chip) the platform is based on.

The background really is commit d916b1be94b6 and its changelog is kind
of misleading, unfortunately.  What it did, among other things, was to
cause the NVMe driver to prevent the PCI bus type from applying the
standard PCI PM to the devices handled by it in the suspend-to-idle
flow.  The reason for doing that was a (reportedly) widespread failure
to take the PCIe link down during D0 -> D3hot transitions of NVMe
devices, which then prevented the platform from going into a deep
enough low-power state while suspended (because it was not sure
whether or not the NVMe device was really "sufficiently" inactive).
[I guess I should mention that in the changelog of the $subject
patch.]  So the idea was to put the (NVMe) device into a low-power
state internally and then let ASPM take care of the PCIe link.

Of course, that can only work if ASPM is enabled at all for the device
in question, even though it may not be sufficient as you say below.

> The spec (PCIe r5.0, sec 5.4.1.1.1 for L0s, 5.4.1.2.1 for L1) is
> careful to say that when the conditions are right, devices "should"
> enter L0s but it is never mandatory, or "may" enter L1.
>
> And this patch assumes that if ASPM is enabled, the link will
> eventually go to L0s or L1.

No, it doesn't.

It avoids failure in the case in which it is guaranteed to happen
(disabled ASPM) and that's it.

> Because the PCIe spec doesn't mandate that transition, I think this patch makes the
> driver dependent on device-specific behavior.

IMO not really.  It just adds a "don't do it if you are going to fail"
kind of check.

>
> >               nvme_dev_disable(ndev, true);
> >               return 0;
> >       }
> > @@ -2880,7 +2888,6 @@ static int nvme_suspend(struct device *d
> >           ctrl->state != NVME_CTRL_ADMIN_ONLY)
> >               goto unfreeze;
> >
> > -     ndev->last_ps = 0;
> >       ret = nvme_get_power_state(ctrl, &ndev->last_ps);
> >       if (ret < 0)
> >               goto unfreeze;
> >
> >
> >

  reply	other threads:[~2019-08-08 14:48 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-25  9:51 [Regression] Commit "nvme/pci: Use host managed power state for suspend" has problems Rafael J. Wysocki
2019-07-25 14:02 ` Kai-Heng Feng
2019-07-25 16:23   ` Mario.Limonciello
2019-07-25 17:03     ` Rafael J. Wysocki
2019-07-25 17:23       ` Mario.Limonciello
2019-07-25 18:20       ` Kai-Heng Feng
2019-07-25 19:09         ` Mario.Limonciello
2019-07-30 10:45       ` Rafael J. Wysocki
2019-07-30 14:41         ` Keith Busch
2019-07-30 17:14           ` Mario.Limonciello
2019-07-30 18:50             ` Kai-Heng Feng
2019-07-30 19:19               ` Keith Busch
2019-07-30 21:05                 ` Mario.Limonciello
2019-07-30 21:31                   ` Keith Busch
2019-07-31 21:25                     ` Rafael J. Wysocki
2019-07-31 22:19                       ` Keith Busch
2019-07-31 22:33                         ` Rafael J. Wysocki
2019-08-01  9:05                           ` Kai-Heng Feng
2019-08-01 17:29                             ` Rafael J. Wysocki
2019-08-01 19:05                               ` Mario.Limonciello
2019-08-01 22:26                                 ` Rafael J. Wysocki
2019-08-02 10:55                                   ` Kai-Heng Feng
2019-08-02 11:04                                     ` Rafael J. Wysocki
2019-08-05 19:13                                       ` Kai-Heng Feng
2019-08-05 21:28                                         ` Rafael J. Wysocki
2019-08-06 14:02                                           ` Mario.Limonciello
2019-08-06 15:00                                             ` Rafael J. Wysocki
2019-08-07 10:29                                               ` Rafael J. Wysocki
2019-08-01 20:22                             ` Keith Busch
2019-08-07  9:48                         ` Rafael J. Wysocki
2019-08-07 10:45                           ` Christoph Hellwig
2019-08-07 10:54                             ` Rafael J. Wysocki
2019-08-07  9:53                         ` [PATCH] nvme-pci: Do not prevent PCI bus-level PM from being used Rafael J. Wysocki
2019-08-07 10:14                           ` Rafael J. Wysocki
2019-08-07 10:43                           ` Christoph Hellwig
2019-08-07 14:37                           ` Keith Busch
2019-08-08  8:36                         ` [PATCH] nvme-pci: Allow PCI bus-level PM to be used if ASPM is disabled Rafael J. Wysocki
2019-08-08  8:48                           ` Christoph Hellwig
2019-08-08  9:06                             ` Rafael J. Wysocki
2019-08-08 10:03                         ` [PATCH v2 0/2] " Rafael J. Wysocki
2019-08-08 10:06                           ` [PATCH v2 1/2] PCI: PCIe: ASPM: Introduce pcie_aspm_enabled_mask() Rafael J. Wysocki
2019-08-08 13:15                             ` Bjorn Helgaas
2019-08-08 14:48                               ` Rafael J. Wysocki
2019-08-08 10:10                           ` [PATCH v2 2/2] nvme-pci: Allow PCI bus-level PM to be used if ASPM is disabled Rafael J. Wysocki
2019-08-08 13:43                             ` Bjorn Helgaas
2019-08-08 14:47                               ` Rafael J. Wysocki [this message]
2019-08-08 17:06                                 ` Rafael J. Wysocki
2019-08-08 18:39                                 ` Bjorn Helgaas
2019-08-08 20:01                                   ` Keith Busch
2019-08-08 20:05                                   ` Mario.Limonciello
2019-08-08 20:41                                   ` Rafael J. Wysocki
2019-08-09  4:47                                     ` Bjorn Helgaas
2019-08-09  8:04                                       ` Rafael J. Wysocki
2019-08-08 21:51                         ` [PATCH v3 0/2] " Rafael J. Wysocki
2019-08-08 21:55                           ` [PATCH v3 1/2] PCI: PCIe: ASPM: Introduce pcie_aspm_enabled() Rafael J. Wysocki
2019-08-09  4:50                             ` Bjorn Helgaas
2019-08-09  8:00                               ` Rafael J. Wysocki
2019-10-07 22:34                             ` Bjorn Helgaas
2019-10-08  9:27                               ` Rafael J. Wysocki
2019-10-08 21:16                                 ` Bjorn Helgaas
2019-10-08 22:54                                   ` Rafael J. Wysocki
2019-10-09 12:49                                     ` Bjorn Helgaas
2019-08-08 21:58                           ` [PATCH v3 2/2] nvme-pci: Allow PCI bus-level PM to be used if ASPM is disabled Rafael J. Wysocki
2019-08-08 22:13                           ` [PATCH v3 0/2] " Keith Busch
2019-08-09  8:05                             ` Rafael J. Wysocki
2019-08-09 14:52                               ` Keith Busch
2019-07-25 16:59   ` [Regression] Commit "nvme/pci: Use host managed power state for suspend" has problems Rafael J. Wysocki
2019-07-25 14:52 ` Keith Busch
2019-07-25 19:48   ` Rafael J. Wysocki
2019-07-25 19:52     ` Keith Busch
2019-07-25 20:02       ` Rafael J. Wysocki
2019-07-26 14:02         ` Kai-Heng Feng
2019-07-27 12:55           ` Rafael J. Wysocki
2019-07-29 15:51             ` Mario.Limonciello
2019-07-29 22:05               ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJZ5v0h=nz8yXwOOGBUB9m1GtJPOqBwtNK7zXPNMJjzPhMWd9w@mail.gmail.com' \
    --to=rafael@kernel.org \
    --cc=Mario.Limonciello@dell.com \
    --cc=hch@lst.de \
    --cc=helgaas@kernel.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=kbusch@kernel.org \
    --cc=keith.busch@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rajatja@google.com \
    --cc=rjw@rjwysocki.net \
    --cc=sagi@grimberg.me \
    --subject='Re: [PATCH v2 2/2] nvme-pci: Allow PCI bus-level PM to be used if ASPM is disabled' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).