linux-kernel.vger.kernel.org archive mirror
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Rafael Wysocki <rafael.j.wysocki@intel.com>,
	Mario Limonciello <Mario.Limonciello@dell.com>,
	Keith Busch <kbusch@kernel.org>,
	Keith Busch <keith.busch@intel.com>, Jens Axboe <axboe@fb.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	linux-nvme <linux-nvme@lists.infradead.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] nvme-pci: Use non-operational power state instead of D3 on Suspend-to-Idle
Date: Thu, 9 May 2019 22:48:00 +0200
Message-ID: <CAJZ5v0iezuuXeHAuEPbJ2fAcbmaySCAofU+yZ-j-WuN6O+yq0A@mail.gmail.com>
In-Reply-To: <20190509092514.GA18598@lst.de>

On Thu, May 9, 2019 at 11:25 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, May 09, 2019 at 11:19:37AM +0200, Rafael J. Wysocki wrote:
> > Right, the choice of the target system state has already been made
> > when their callbacks get invoked (and it has been made by user space,
> > not by the platform).
>
> From a previous discussion I remember the main problem here is that
> a lot of consumer NVMe devices use more power when put into D3hot than
> when left to manage their power state transitions themselves.
> Based on this patch there also might be some other devices that want
> an explicit power state transition from the host, but still should not
> be put into D3hot.
>
> The avoid D3hot at all cost thing seems to be based on the Windows
> broken^H^H^H^H^H^Hmodern standby principles.  So for platforms that
> follow the modern standby model we need to avoid putting NVMe devices
> that support power management into D3hot somehow.  This patch does a
> few more things, but at least for the device where I was involved in
> the earlier discussion those are not needed, and from the Linux
> point of view many of them seem wrong too.
>
> How do you think we best make that distinction?  Are the pm_ops
> enough if we don't use the simple version?

First, I think that it is instructive to look at what happens without
the patch: nvme_suspend() gets called by pci_pm_suspend() (which
basically causes the device to be "stopped" IIUC) and then
pci_pm_suspend_noirq() is expected to put the device into the right
power state through pci_prepare_to_sleep().  In theory, this should
work for both S2R and S2I as long as the standard PCIe PM plus
possibly ACPI PM is sufficient for the device.  [Of course, the
platform firmware invoked at the last stage of S2R can "fix up" things
to reduce power further, but that should not be necessary if all is
handled properly up to this point.]
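
As a rough illustration of the ordering described above, here is a
minimal userspace sketch in C.  It is not kernel code: every model_*
name is hypothetical, and the target state is hard-coded to D3hot for
simplicity, whereas the real pci_prepare_to_sleep() chooses the target
state itself.  The only point is that the driver callback runs first
and the bus-level noirq phase changes the power state afterwards, in
the same way for S2R and S2I.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; all names below are hypothetical. */
enum model_pci_state { MODEL_D0, MODEL_D3HOT };

struct model_dev {
	bool stopped;               /* set by the driver's suspend callback */
	enum model_pci_state state; /* set by the bus-level noirq phase */
};

/* Stands in for nvme_suspend(): the driver "stops" the device. */
static void model_driver_suspend(struct model_dev *dev)
{
	dev->stopped = true;
}

/* Stands in for pci_pm_suspend(): the bus code invokes the driver. */
static void model_bus_suspend(struct model_dev *dev)
{
	model_driver_suspend(dev);
}

/*
 * Stands in for pci_pm_suspend_noirq() calling pci_prepare_to_sleep().
 * Assumption: the chosen low-power state is D3hot.
 */
static void model_bus_suspend_noirq(struct model_dev *dev)
{
	assert(dev->stopped); /* the driver callback must have run already */
	dev->state = MODEL_D3HOT;
}

/* The same two phases run regardless of S2R vs. S2I in this model. */
static void model_system_suspend(struct model_dev *dev)
{
	model_bus_suspend(dev);
	model_bus_suspend_noirq(dev);
}
```

Calling model_system_suspend() on a device that starts in MODEL_D0
leaves it stopped and in MODEL_D3HOT, which is exactly the outcome the
patch under discussion wants to avoid for some NVMe devices.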

The claim in the patch changelog is that one design choice in Windows
related to "Modern Standby" has caused our default PCI PM to not apply
to NVMe devices in general (or to apply to them, but without much
effect, which is practically equivalent IMO).  This is not about a
"different paradigm" (as Mario put it) or a different type of system
suspend, but about the default PCI PM being basically useless for
those devices at least in some configurations.

And BTW, the same problem would have affected PM-runtime, had it been
supported by the nvme driver, because Linux uses the combination of
the standard PCIe PM and ACPI PM for PM-runtime too, and the
"paradigm" in there is pretty much the same as for S2I, so let's not
confuse things, pretty please.

All of this means that the driver needs to override the default PCI PM
like in the patch that Keith has just posted.  Unfortunately, it looks
like the "suspend via firmware" check needs to be there, because the
platform firmware doing S3 on some platforms may get confused by the
custom PM in the driver.
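
To make the shape of that override concrete, here is a hedged
userspace sketch of the decision only.  It is not Keith's patch and
not kernel code: the sketch_* names and the supports_nvme_pm field are
hypothetical, and the boolean argument stands in for whatever the
"suspend via firmware" check returns (in the kernel that would
presumably be pm_suspend_via_firmware()).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; all names below are hypothetical. */
enum sketch_action {
	/* Shut the controller down and let the default PCI PM run. */
	SKETCH_SHUTDOWN_USE_PCI_PM,
	/* Use an NVMe-level low-power state and keep PCI PM away. */
	SKETCH_NVME_POWER_STATE_SKIP_PCI_PM,
};

struct sketch_ctrl {
	/* Assumption: the controller advertises NVMe power management. */
	bool supports_nvme_pm;
};

/*
 * Decide what the driver's suspend callback should do.
 * suspend_via_firmware models the case where platform firmware
 * finishes the transition (e.g. S3) and may be confused by custom PM.
 */
static enum sketch_action
sketch_nvme_suspend(const struct sketch_ctrl *ctrl, bool suspend_via_firmware)
{
	assert(ctrl);

	if (suspend_via_firmware || !ctrl->supports_nvme_pm)
		return SKETCH_SHUTDOWN_USE_PCI_PM;

	return SKETCH_NVME_POWER_STATE_SKIP_PCI_PM;
}
```

In this model only the S2I case on a controller with NVMe-level power
management takes the custom path; everything else falls back to the
default PCI PM behavior.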
