linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] pci: add pci_dev_is_alive API
@ 2021-05-25 12:59 Lambert Wang
  2021-05-25 13:20 ` Krzysztof Wilczyński
  2021-05-25 14:08 ` Bjorn Helgaas
  0 siblings, 2 replies; 8+ messages in thread
From: Lambert Wang @ 2021-05-25 12:59 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, Lambert Wang

Device drivers use this API to proactively check if the device
is alive or not. It is helpful for some PCI devices to detect
surprise removal and do recovery when Hotplug function is disabled.

Note: Device in power states larger than D0 is also treated not alive
by this function.

Signed-off-by: Lambert Wang <lambert.q.wang@gmail.com>
---
 drivers/pci/pci.c   | 23 +++++++++++++++++++++++
 include/linux/pci.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b717680377a9..8a7c039b1cd5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4659,6 +4659,29 @@ int pcie_flr(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pcie_flr);
 
+/**
+ * pci_dev_is_alive - check the pci device is alive or not
+ * @pdev: the PCI device
+ *
+ * Device drivers use this API to proactively check if the device
+ * is alive or not. It is helpful for some PCI devices to detect
+ * surprise removal and do recovery when Hotplug function is disabled.
+ *
+ * Note: Device in power state larger than D0 is also treated not alive
+ * by this function.
+ *
+ * Returns true if the device is alive.
+ */
+bool pci_dev_is_alive(struct pci_dev *pdev)
+{
+	u16 vendor;
+
+	pci_read_config_word(pdev, PCI_VENDOR_ID, &vendor);
+
+	return vendor == pdev->vendor;
+}
+EXPORT_SYMBOL(pci_dev_is_alive);
+
 static int pci_af_flr(struct pci_dev *dev, int probe)
 {
 	int pos;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c20211e59a57..2a3ba06a7347 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1227,6 +1227,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 void pcie_print_link_status(struct pci_dev *dev);
 bool pcie_has_flr(struct pci_dev *dev);
 int pcie_flr(struct pci_dev *dev);
+bool pci_dev_is_alive(struct pci_dev *pdev);
 int __pci_reset_function_locked(struct pci_dev *dev);
 int pci_reset_function(struct pci_dev *dev);
 int pci_reset_function_locked(struct pci_dev *dev);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] pci: add pci_dev_is_alive API
  2021-05-25 12:59 [PATCH] pci: add pci_dev_is_alive API Lambert Wang
@ 2021-05-25 13:20 ` Krzysztof Wilczyński
  2021-05-26  6:12   ` Lambert Wang
  2021-05-25 14:08 ` Bjorn Helgaas
  1 sibling, 1 reply; 8+ messages in thread
From: Krzysztof Wilczyński @ 2021-05-25 13:20 UTC (permalink / raw)
  To: Lambert Wang; +Cc: Bjorn Helgaas, linux-pci, linux-kernel

Hi Lambert,

Thank you for sending the patch over!

To match the style of other patches you'd need to capitalise "PCI" in
the subject, see the following for some examples:

 $ git log --oneline drivers/pci/pci.c 

Also, it might be worth mentioning in the subject that this is a new API
that will be added.

> Device drivers use this API to proactively check if the device
> is alive or not. It is helpful for some PCI devices to detect
> surprise removal and do recovery when Hotplug function is disabled.
> 
> Note: Device in power states larger than D0 is also treated not alive
> by this function.
[...]

Question to you: do you have any particular users of this new API in
mind?  Or is this solving some problem you've seen and/or reported via
the kernel Bugzilla?

> +/**
> + * pci_dev_is_alive - check the pci device is alive or not
> + * @pdev: the PCI device

That would be "PCI" in the function brief above.  Also, try to be
consistent and capitalise everything plus add missing periods, see the
following for an example on how to write kernel-doc:

  https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation

> + * Device drivers use this API to proactively check if the device
> + * is alive or not. It is helpful for some PCI devices to detect
> + * surprise removal and do recovery when Hotplug function is disabled.

As per my question above - what users of this new API do you have in
mind?  Are they any other patches pending adding users of this API?

> + * Note: Device in power state larger than D0 is also treated not alive
> + * by this function.
> + *
> + * Returns true if the device is alive.
> + */
> +bool pci_dev_is_alive(struct pci_dev *pdev)
> +{
> +	u16 vendor;
> +
> +	pci_read_config_word(pdev, PCI_VENDOR_ID, &vendor);
> +
> +	return vendor == pdev->vendor;
> +}
> +EXPORT_SYMBOL(pci_dev_is_alive);

Why not use the EXPORT_SYMBOL_GPL()?

	Krzysztof

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pci: add pci_dev_is_alive API
  2021-05-25 12:59 [PATCH] pci: add pci_dev_is_alive API Lambert Wang
  2021-05-25 13:20 ` Krzysztof Wilczyński
@ 2021-05-25 14:08 ` Bjorn Helgaas
  1 sibling, 0 replies; 8+ messages in thread
From: Bjorn Helgaas @ 2021-05-25 14:08 UTC (permalink / raw)
  To: Lambert Wang
  Cc: Bjorn Helgaas, linux-pci, linux-kernel, Krzysztof Wilczyński

[+cc Krzysztof]

On Tue, May 25, 2021 at 08:59:25PM +0800, Lambert Wang wrote:
> Device drivers use this API to proactively check if the device
> is alive or not. It is helpful for some PCI devices to detect
> surprise removal and do recovery when Hotplug function is disabled.
> 
> Note: Device in power states larger than D0 is also treated not alive
> by this function.
> 
> Signed-off-by: Lambert Wang <lambert.q.wang@gmail.com>
> ---
>  drivers/pci/pci.c   | 23 +++++++++++++++++++++++
>  include/linux/pci.h |  1 +
>  2 files changed, 24 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b717680377a9..8a7c039b1cd5 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4659,6 +4659,29 @@ int pcie_flr(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(pcie_flr);
>  
> +/**
> + * pci_dev_is_alive - check the pci device is alive or not
> + * @pdev: the PCI device
> + *
> + * Device drivers use this API to proactively check if the device
> + * is alive or not. It is helpful for some PCI devices to detect
> + * surprise removal and do recovery when Hotplug function is disabled.
> + *
> + * Note: Device in power state larger than D0 is also treated not alive
> + * by this function.
> + *
> + * Returns true if the device is alive.
> + */
> +bool pci_dev_is_alive(struct pci_dev *pdev)
> +{
> +	u16 vendor;
> +
> +	pci_read_config_word(pdev, PCI_VENDOR_ID, &vendor);
> +
> +	return vendor == pdev->vendor;
> +}
> +EXPORT_SYMBOL(pci_dev_is_alive);

This is quite similar to pci_device_is_present().  Does that not do
what you need?

I'm not a big fan of either pci_device_is_present() or
pci_dev_is_alive() because they are racy.  You must assume that even
if they return "true," the device may disappear because of a surprise
removal before you can act on that information.

>  static int pci_af_flr(struct pci_dev *dev, int probe)
>  {
>  	int pos;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index c20211e59a57..2a3ba06a7347 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1227,6 +1227,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>  void pcie_print_link_status(struct pci_dev *dev);
>  bool pcie_has_flr(struct pci_dev *dev);
>  int pcie_flr(struct pci_dev *dev);
> +bool pci_dev_is_alive(struct pci_dev *pdev);
>  int __pci_reset_function_locked(struct pci_dev *dev);
>  int pci_reset_function(struct pci_dev *dev);
>  int pci_reset_function_locked(struct pci_dev *dev);
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pci: add pci_dev_is_alive API
  2021-05-25 13:20 ` Krzysztof Wilczyński
@ 2021-05-26  6:12   ` Lambert Wang
  2021-05-26 16:23     ` Bjorn Helgaas
  2021-05-26 18:18     ` Lukas Wunner
  0 siblings, 2 replies; 8+ messages in thread
From: Lambert Wang @ 2021-05-26  6:12 UTC (permalink / raw)
  To: Krzysztof Wilczyński, Bjorn Helgaas; +Cc: linux-pci, linux-kernel

Hi  Krzysztof and Bjorn,

Thanks for your comments and your time.Your questions are answered inline.

I have checked and tested *pci_device_is_present* as pointed out by Bjorn.
I think that API could satisfy my needs.

So this patch request could be dropped. Sorry for the inconvenience.

On Tue, May 25, 2021 at 9:20 PM Krzysztof Wilczyński <kw@linux.com> wrote:
>
> Hi Lambert,
>
> Thank you for sending the patch over!
>
> To match the style of other patches you'd need to capitalise "PCI" in
> the subject, see the following for some examples:
>
>  $ git log --oneline drivers/pci/pci.c
>
> Also, it might be worth mentioning in the subject that this is a new API
> that will be added.
>

I will be careful on the styles of patch title and description in my
future patches.

> > Device drivers use this API to proactively check if the device
> > is alive or not. It is helpful for some PCI devices to detect
> > surprise removal and do recovery when Hotplug function is disabled.
> >
> > Note: Device in power states larger than D0 is also treated not alive
> > by this function.
> [...]
>
> Question to you: do you have any particular users of this new API in
> mind?  Or is this solving some problem you've seen and/or reported via
> the kernel Bugzilla?
>

The user is our new PCI driver under development for WWAN devices .
Surprise removal could happen under multiple circumstances.
e.g. Exception, Link Failure, etc.

We wanted this API to detect surprise removal or check device recovery
when AER and Hotplug are disabled.

I thought the API could be commonly used for many similar devices.

> > +/**
> > + * pci_dev_is_alive - check the pci device is alive or not
> > + * @pdev: the PCI device
>
> That would be "PCI" in the function brief above.  Also, try to be
> consistent and capitalise everything plus add missing periods, see the
> following for an example on how to write kernel-doc:
>
>   https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation
>
> > + * Device drivers use this API to proactively check if the device
> > + * is alive or not. It is helpful for some PCI devices to detect
> > + * surprise removal and do recovery when Hotplug function is disabled.
>
> As per my question above - what users of this new API do you have in
> mind?  Are they any other patches pending adding users of this API?
>
> > + * Note: Device in power state larger than D0 is also treated not alive
> > + * by this function.
> > + *
> > + * Returns true if the device is alive.
> > + */
> > +bool pci_dev_is_alive(struct pci_dev *pdev)
> > +{
> > +     u16 vendor;
> > +
> > +     pci_read_config_word(pdev, PCI_VENDOR_ID, &vendor);
> > +
> > +     return vendor == pdev->vendor;
> > +}
> > +EXPORT_SYMBOL(pci_dev_is_alive);
>
> Why not use the EXPORT_SYMBOL_GPL()?
>
>         Krzysztof

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pci: add pci_dev_is_alive API
  2021-05-26  6:12   ` Lambert Wang
@ 2021-05-26 16:23     ` Bjorn Helgaas
  2021-05-27  3:43       ` Lambert Wang
  2021-05-26 18:18     ` Lukas Wunner
  1 sibling, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2021-05-26 16:23 UTC (permalink / raw)
  To: Lambert Wang
  Cc: Krzysztof Wilczyński, Bjorn Helgaas, linux-pci, linux-kernel

On Wed, May 26, 2021 at 02:12:38PM +0800, Lambert Wang wrote:
> ...
> The user is our new PCI driver under development for WWAN devices .
> Surprise removal could happen under multiple circumstances.
> e.g. Exception, Link Failure, etc.
> 
> We wanted this API to detect surprise removal or check device recovery
> when AER and Hotplug are disabled.
> 
> I thought the API could be commonly used for many similar devices.

Be careful with this.  pci_device_is_present() is not a good way to
detect surprise removal.  Surprise removal can happen at any time, for
example, it can occur after you call pci_device_is_present() but
before you use the result:

  present = pci_device_is_present(pdev);
  /* present == true */
  /* device may be removed here */
  if (present)
    xxx; /* this operation may fail */

You have to assume that *any* operation on the device can fail because
the device has been removed.  In general, there's no response for a
PCIe write to the device, so you can't really check whether a write
has failed.

There *are* responses for reads, of course, if the device has been
removed, a read will cause a failure response.  Most PCIe controllers
turn that response into ~0 data to satisfy the read.  So the only
reliable way to detect surprise removal is to check for ~0 data when
doing an MMIO read from the device.  Of course, ~0 may be either valid
data or a symptom of a failure response, so you may have to do
additional work to distinguish those two cases.

Bjorn

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pci: add pci_dev_is_alive API
  2021-05-26  6:12   ` Lambert Wang
  2021-05-26 16:23     ` Bjorn Helgaas
@ 2021-05-26 18:18     ` Lukas Wunner
  2021-05-27  3:52       ` Lambert Wang
  1 sibling, 1 reply; 8+ messages in thread
From: Lukas Wunner @ 2021-05-26 18:18 UTC (permalink / raw)
  To: Lambert Wang
  Cc: Krzysztof Wilczy??ski, Bjorn Helgaas, linux-pci, linux-kernel

On Wed, May 26, 2021 at 02:12:38PM +0800, Lambert Wang wrote:
> The user is our new PCI driver under development for WWAN devices .
> Surprise removal could happen under multiple circumstances.
> e.g. Exception, Link Failure, etc.
> 
> We wanted this API to detect surprise removal or check device recovery
> when AER and Hotplug are disabled.

You may want to take a look at pci_dev_is_disconnected().

Be aware of its limitations, which Bjorn has already pointed out
and which are discussed in more detail under the following link
in the "Surprise removal" section:

https://lwn.net/Articles/767885/

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pci: add pci_dev_is_alive API
  2021-05-26 16:23     ` Bjorn Helgaas
@ 2021-05-27  3:43       ` Lambert Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Lambert Wang @ 2021-05-27  3:43 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Krzysztof Wilczyński, Bjorn Helgaas, linux-pci, linux-kernel

On Thu, May 27, 2021 at 12:23 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, May 26, 2021 at 02:12:38PM +0800, Lambert Wang wrote:
> > ...
> > The user is our new PCI driver under development for WWAN devices .
> > Surprise removal could happen under multiple circumstances.
> > e.g. Exception, Link Failure, etc.
> >
> > We wanted this API to detect surprise removal or check device recovery
> > when AER and Hotplug are disabled.
> >
> > I thought the API could be commonly used for many similar devices.
>
> Be careful with this.  pci_device_is_present() is not a good way to
> detect surprise removal.  Surprise removal can happen at any time, for
> example, it can occur after you call pci_device_is_present() but
> before you use the result:
>
>   present = pci_device_is_present(pdev);
>   /* present == true */
>   /* device may be removed here */
>   if (present)
>     xxx; /* this operation may fail */
>
> You have to assume that *any* operation on the device can fail because
> the device has been removed.  In general, there's no response for a
> PCIe write to the device, so you can't really check whether a write
> has failed.
>
> There *are* responses for reads, of course, if the device has been
> removed, a read will cause a failure response.  Most PCIe controllers
> turn that response into ~0 data to satisfy the read.  So the only
> reliable way to detect surprise removal is to check for ~0 data when
> doing an MMIO read from the device.  Of course, ~0 may be either valid
> data or a symptom of a failure response, so you may have to do
> additional work to distinguish those two cases.

Thanks for reminding.  :)

Yes the check has race conditions. When the driver is doing recovery detection,
the check result is not reliable.

It is pretty useful when the driver wants to confirm if the device is
absent *after*
the driver finds it's not working as expected.

>
> Bjorn

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pci: add pci_dev_is_alive API
  2021-05-26 18:18     ` Lukas Wunner
@ 2021-05-27  3:52       ` Lambert Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Lambert Wang @ 2021-05-27  3:52 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Krzysztof Wilczy??ski, Bjorn Helgaas, linux-pci, linux-kernel

On Thu, May 27, 2021 at 2:18 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Wed, May 26, 2021 at 02:12:38PM +0800, Lambert Wang wrote:
> > The user is our new PCI driver under development for WWAN devices .
> > Surprise removal could happen under multiple circumstances.
> > e.g. Exception, Link Failure, etc.
> >
> > We wanted this API to detect surprise removal or check device recovery
> > when AER and Hotplug are disabled.
>
> You may want to take a look at pci_dev_is_disconnected().
>
> Be aware of its limitations, which Bjorn has already pointed out
> and which are discussed in more detail under the following link
> in the "Surprise removal" section:
>
> https://lwn.net/Articles/767885/
>

Thanks for the suggestion and the article. Currently I prefer
pci_device_is_present() for my scenario.

e.g. pci_dev_is_disconnected() seems to use a cached value. If the
driver wants to check the device's absence
after it *senses* something abnormal, pci_device_is_present() is more suitable.

> Thanks,
>
> Lukas

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-05-27  3:52 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-25 12:59 [PATCH] pci: add pci_dev_is_alive API Lambert Wang
2021-05-25 13:20 ` Krzysztof Wilczyński
2021-05-26  6:12   ` Lambert Wang
2021-05-26 16:23     ` Bjorn Helgaas
2021-05-27  3:43       ` Lambert Wang
2021-05-26 18:18     ` Lukas Wunner
2021-05-27  3:52       ` Lambert Wang
2021-05-25 14:08 ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).