From: Bjorn Helgaas <helgaas@kernel.org>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
Linux PCI <linux-pci@vger.kernel.org>,
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
Keith Busch <keith.busch@intel.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Mika Westerberg <mika.westerberg@linux.intel.com>,
Linux PM <linux-pm@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/5] PCI / PM: Check for error when reading PME status
Date: Tue, 13 Aug 2019 20:15:09 -0500 [thread overview]
Message-ID: <20190814011509.GB206171@google.com> (raw)
In-Reply-To: <2341382.rHjnX2mYrU@kreacher>
On Wed, Aug 14, 2019 at 01:26:56AM +0200, Rafael J. Wysocki wrote:
> On Tuesday, August 6, 2019 3:36:38 PM CEST Bjorn Helgaas wrote:
> > On Mon, Aug 05, 2019 at 11:02:51PM +0200, Rafael J. Wysocki wrote:
> > > On Mon, Aug 5, 2019 at 10:52 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > >
> > > > pci_check_pme_status() reads the Power Management capability to determine
> > > > whether a device has generated a PME. The capability is in config space,
> > > > which is accessible in D0, D1, D2, and D3hot, but not in D3cold.
> > > >
> > > > If we call pci_check_pme_status() on a device that's in D3cold, config
> > > > reads fail and return ~0 data, which we erroneously interpreted as "the
> > > > device has generated a PME".
> > > >
> > > > 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
> > > > avoided many of these problems by avoiding pci_check_pme_status() if we
> > > > think the device is in D3cold. However, it is not a complete fix because
> > > > the device may go to D3cold after we check its power state but before
> > > > pci_check_pme_status() reads the Power Management Status Register.
> > > >
> > > > Return false ("device has not generated a PME") if we get an error response
> > > > reading the Power Management Status Register.
> > > >
> > > > Fixes: 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
> > > > Fixes: 71a83bd727cc ("PCI/PM: add runtime PM support to PCIe port")
> > > > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > > > ---
> > > > drivers/pci/pci.c | 3 +++
> > > > 1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index 984171d40858..af6a97d7012b 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -2008,6 +2008,9 @@ bool pci_check_pme_status(struct pci_dev *dev)
> > > >
> > > > pmcsr_pos = dev->pm_cap + PCI_PM_CTRL;
> > > > pci_read_config_word(dev, pmcsr_pos, &pmcsr);
> > > > + if (pmcsr == (u16) PCI_ERROR_RESPONSE)
> > > > + return false;
> > >
> > > No, sorry.
> > >
> > > We tried that and it didn't work.
> > >
> > > pcie_pme_handle_request() depends on this returning "true" for all
> > > bits set, as from its perspective "device is not accessible" may very
> > > well mean "device may have signaled PME".
> >
> > Right, it's obviously wrong in the case of devices that advertise
> > D3cold in PME_Support, i.e., devices that can generate PME even with
> > main power off. Also, we may want to try to wake up devices if the
> > config read fails for a reason other than the device being in D3cold.
> >
> > What I don't like about the current code is that it checks
> > PCI_PM_CTRL_PME_STATUS in data that may be completely bogus.
>
> Whether or not the other bits in the register make sense doesn't
> matter here. Only the PME_STATUS bit matters.
Of course. It just relies on the implicit assumption that the bit in
the error response matches the PME_STATUS state that we want, which is
a little bit ugly.
> > Do you think it would be better to do something like this:
> >
> > pci_read_config_word(dev, pmcsr_pos, &pmcsr);
> > if (pmcsr == (u16) PCI_ERROR_RESPONSE) {
> > if (pci_pme_capable(dev, PCI_PM_CAP_PME_D3cold))
> > return true;
> > return false;
> > }
> >
> > or maybe this:
> >
> > pci_read_config_word(dev, pmcsr_pos, &pmcsr);
> > if (pmcsr == (u16) PCI_ERROR_RESPONSE)
> > return true;
>
> In this case it still would be prudent to check PME_ENABLE before
> returning true and so there is no practical difference between
> ERROR_RESPONSE and the valid data with PME_STATUS set.
>
> Except that in the ERROR_RESPONSE case we may as well avoid the
> PMCSR write which is not going to make a difference.
>
> > We should get PCI_ERROR_RESPONSE pretty reliably from devices in
> > D3cold, so the first possibility would cover that case.
> >
> > But since pci_check_pme_status() basically returns a hint ("true"
> > means a device *may* have generated a PME), and even if the hint is
> > wrong, the worst that happens is an unnecessary wakeup, maybe the
> > second possibility is safer.
> >
> > What do you think?
>
> So if you really want to avoid the PMCSR write in the ERROR_RESPONSE case,
> something like this can be done IMO:
>
> return false;
>
> /* Clear PME status. */
> - pmcsr |= PCI_PM_CTRL_PME_STATUS;
> if (pmcsr & PCI_PM_CTRL_PME_ENABLE) {
> + if (pmcsr == (u16) PCI_ERROR_RESPONSE)
> + return true;
> +
> /* Disable PME to avoid interrupt flood. */
> pmcsr &= ~PCI_PM_CTRL_PME_ENABLE;
> ret = true;
Agreed, that's the conclusion I came to as well. I wouldn't do this
just to avoid the config write, since as you mentioned that will get
dropped anyway. The reason I would consider this is as an example of
how drivers might think about validating data they read from devices.
Bjorn
next prev parent reply other threads:[~2019-08-14 1:15 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-05 20:52 [PATCH 0/5] PCI: Add PCI_ERROR_RESPONSE, check for errors Bjorn Helgaas
2019-08-05 20:52 ` [PATCH 1/5] PCI: Add PCI_ERROR_RESPONSE definition Bjorn Helgaas
2019-08-05 21:16 ` Rafael J. Wysocki
2019-08-05 20:52 ` [PATCH 2/5] PCI / PM: Return error when changing power state from D3cold Bjorn Helgaas
2019-08-05 21:15 ` Rafael J. Wysocki
2019-08-05 20:52 ` [PATCH 3/5] PCI / PM: Check for error when reading PME status Bjorn Helgaas
2019-08-05 21:02 ` Rafael J. Wysocki
2019-08-06 13:36 ` Bjorn Helgaas
2019-08-13 23:26 ` Rafael J. Wysocki
2019-08-14 1:15 ` Bjorn Helgaas [this message]
2019-08-05 20:52 ` [PATCH 4/5] PCI / PM: Check for error when reading Power State Bjorn Helgaas
2019-08-05 21:09 ` Rafael J. Wysocki
2019-08-09 22:01 ` Bjorn Helgaas
2019-08-13 22:59 ` Rafael J. Wysocki
2019-08-14 1:08 ` Bjorn Helgaas
2019-08-05 20:52 ` [PATCH 5/5] PCI / PM: Decode D3cold power state correctly Bjorn Helgaas
2019-08-05 21:14 ` Rafael J. Wysocki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190814011509.GB206171@google.com \
--to=helgaas@kernel.org \
--cc=gregkh@linuxfoundation.org \
--cc=keith.busch@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=lorenzo.pieralisi@arm.com \
--cc=mika.westerberg@linux.intel.com \
--cc=rafael@kernel.org \
--cc=rjw@rjwysocki.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).