linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] pci: Make return value of pcie_capability_read*() consistent
@ 2020-04-24 14:27 Bolarinwa Olayemi Saheed
  2020-04-24 22:30 ` Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Bolarinwa Olayemi Saheed @ 2020-04-24 14:27 UTC (permalink / raw)
  To: helgaas; +Cc: Bolarinwa Olayemi Saheed, bjorn, yangyicong, skhan, linux-pci

pcie_capability_read*() could return 0, -EINVAL, or any of the
PCIBIOS_* error codes (which are positive).
This is behaviour is now changed to return only PCIBIOS_* error
codes on error.
This is consistent with pci_read_config_*(). Callers can now have
a consistent way for checking which error has occurred.

An audit of the callers of this function was made and no case was found
where there is need for a change within the caller function or their
dependencies down the heirarchy.
Out of all caller functions discovered only 8 functions either persist the
return value of pcie_capability_read*() or directly pass on the return
value.

1.) "./drivers/infiniband/hw/hfi1/pcie.c" :
=> pcie_speeds() line-306

	if (ret) {
		dd_dev_err(dd, "Unable to read from PCI config\n");
		return ret;
	}

remarks: The variable "ret" is the captured return value.
         This function passes on the return value. The return value was
	 store only by hfi1_init_dd() line-15076 in
         ./drivers/infiniband/hw/hfi1/chip.c and it behave the same on all
	 errors. So this patch will not require a change in this function.

=> update_lbus_info() line-276

	if (ret) {
		dd_dev_err(dd, "Unable to read from PCI config\n");
		return;
	}

remarks: see below
=> save_pci_variables() line-415, 420, 425

	if (ret)
		goto error;

remarks: see below

=> tune_pcie_caps() line-471

	if ((!ret) && !(ectl & PCI_EXP_DEVCTL_EXT_TAG)) {
		dd_dev_info(dd, "Enabling PCIe extended tags\n");
		ectl |= PCI_EXP_DEVCTL_EXT_TAG;
		ret = pcie_capability_write_word(dd->pcidev,
						 PCI_EXP_DEVCTL, ectl);
		if (ret)
		     dd_dev_info(dd, "Unable to write to PCI config\n");
	}

remarks: see below

=> do_pcie_gen3_transition() line-1247, 1274

	if (ret) {
		dd_dev_err(dd, "Unable to read from PCI config\n");
		return_error = 1;
		goto done;
	}

remarks: The variable "ret" is the captured return value.
         These functions' behaviour is the same on all errors, so they are
	 not be affected by this patch.

2.) "./drivers/net/wireless/realtek/rtw88/pci.c":
=> rtw_pci_link_cfg*() line-1221

	if (ret) {
		rtw_err(rtwdev, "failed to read PCI cap, ret=%d\n", ret);
		return;
	}

remark: The variable "ret" is the captured return value.
        This function returns on all errors, so it will not be affected
	by this patch.

3.) "./drivers/pci/hotplug/pciehp_hpc.c":
=> pciehp_check_link_active() line-219

	if (ret == PCIBIOS_DEVICE_NOT_FOUND || lnk_status == (u16)~0)
		return -ENODEV;

remark: see below

=>pciehp_card_present() line-404

	{Same code as above}

remark: The variable "ret" is the captured return value.
        This 2 functions will not be affected by this patch since they are
	only testing for *DEVICE_NOT_FOUND error.

4.) "./drivers/pci/pcie/bw_notification.c":
=>pcie_link_bandwidth_notification_supported() line-26

	return (ret == PCIBIOS_SUCCESSFUL)
			&& (lnk_cap & PCI_EXP_LNKCAP_LBNC);

remark: see below

=>pcie_bw_notification_irq() line-56

	if (ret != PCIBIOS_SUCCESSFUL || !events)
		return IRQ_NONE;

remark: The variable "ret" is the captured return value.
        In these 2 functions returning PCIBIOS_BAD_REGISTER_NUMBER instead
	of -EINVAL as done in this patch will not affect the behaviour.

5.) "./drivers/pci/probe.c":
=> pci_configure_extended_tags() line-1951

	if (ret)
		return 0;

remark: The variable "ret" is the captured return value.
        This function will not be affected by this patch since it retuns 0
	on ALL error codes.

6.) "./drivers/pci/pci-sysfs.c":
=> current_link_speed_show() line-180

	if (err)
		return -EINVAL;

remark: see below

=>current_link_width_show() line-215

        {same code as in the above function}

remark: The variable "err" is the captured return value.
        This 2 functions will not be affected directly by this patch since
	it retuns -EINVAL on ALL error codes. However, depending on the
	intent, after this patch, this may now be to too generic. This is
	because it will then be possible to use the returned PCIBIOS_*
	error code to identify the error.

7.) "./drivers/dma/ioat/init.c":
=>ioat3_dma_probe() line-1193

	if (err)
		return err;

remark: The variable "ret" is the captured return value.
        This function passes on the return value. Only ioat_pci_probe()
	line-1392 in the same file persists the return value and it's
	behaviour is the same on all errors. So this patch will not change
	the behaviour of these functions.

8.) "./drivers/pci/access.c":
=>pcie_capability_clear_and_set_word() line-493

	if (!ret) {
		val &= ~clear;
		val |= set;
		ret = pcie_capability_write_word(dev, pos, val);
	}

	return ret;

=>pcie_capability_clear_and_set_dword() line-508

    {same as above function}

remark: The variable "ret" is the captured return value.
     This 2 functions will not be affected directly. But after this patch
     they will now be returning PCIBIOS_BAD_REGISTER_NUMBER instead of
     -EINVAL. No case was found where the return value of any of both
     functions were persisted. But their return values are passed on
     directly by:
	- pcie_capability_set_{word|dword}() and
	  pcie_capability_clear_{word|dword}() in ./include/linux/pci.h
          lines(1100-1136): these pass on the received return values
	  directly to:
          - pcie_capability_clear_dword() is not referenced anywhere
          - pcie_capability_set_dword() is referenced but it's return
	    values are not cached
          - pcie_capability_{set, clear}_word() : return value passed on
	    by pci_enable_pcie_error_reporting() in drivers/pci/pcie/aer.c
	    lines-(350,362) these are used by other drivers to log errors,
            in all examined cases all errors are treated the same.
	  - pcie_capability_clear_word() : return value passed on by
	    pci_disable_pcie_error_reporting() in drivers/pci/pcie/aer.c
	    lines-362 which treats all errors are treated the same.

	- pcie_set_readrq() line-5662 and pcie_set_mps() line-5703 in
	  ./drivers/pci/pci.c : both functions pass on the return value of
	  pcie_capability_clear_and_set_word() but also return -EINVAL in
	  cases of some other errors. Currently these errors will not be
	  differentiated, this patch will help differentiate errors in
	  this kind of situation. The function will not be affected but
	  rather it will be enhanced in correctness.

        - pqi_set_pcie_completion_timeout() line-7423 in
	  ./drivers/scsi/smartpqi/smartpqi_init.c : This function will not
          be affected. Although, it passes on the return value, all error
          values are handled the same way by the only reference found at
          line-7473 in the same file.

Signed-off-by: Bolarinwa Olayemi Saheed <refactormyself@gmail.com>
Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
---
Changes in version 4:
 - make patch independent of earlier versions
 - add commit log
 - add justificaation and report on audit of affected functions

 drivers/pci/access.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 79c4a2ef269a..f0baab635b66 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -402,6 +402,10 @@ static bool pcie_capability_reg_implemented(struct pci_dev *dev, int pos)
  * Note that these accessor functions are only for the "PCI Express
  * Capability" (see PCIe spec r3.0, sec 7.8).  They do not apply to the
  * other "PCI Express Extended Capabilities" (AER, VC, ACS, MFVC, etc.)
+ *
+ * On error, this function returns a PCIBIOS_* error code,
+ * you may want to use pcibios_err_to_errno()(include/linux/pci.h)
+ * to convert to a non-PCI code.
  */
 int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val)
 {
@@ -409,7 +413,7 @@ int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val)
 
 	*val = 0;
 	if (pos & 1)
-		return -EINVAL;
+		return PCIBIOS_BAD_REGISTER_NUMBER;
 
 	if (pcie_capability_reg_implemented(dev, pos)) {
 		ret = pci_read_config_word(dev, pci_pcie_cap(dev) + pos, val);
@@ -444,7 +448,7 @@ int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val)
 
 	*val = 0;
 	if (pos & 3)
-		return -EINVAL;
+		return PCIBIOS_BAD_REGISTER_NUMBER;
 
 	if (pcie_capability_reg_implemented(dev, pos)) {
 		ret = pci_read_config_dword(dev, pci_pcie_cap(dev) + pos, val);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] pci: Make return value of pcie_capability_read*() consistent
  2020-04-24 14:27 [PATCH v4] pci: Make return value of pcie_capability_read*() consistent Bolarinwa Olayemi Saheed
@ 2020-04-24 22:30 ` Bjorn Helgaas
  2020-04-26  9:51   ` Saheed Bolarinwa
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2020-04-24 22:30 UTC (permalink / raw)
  To: Bolarinwa Olayemi Saheed; +Cc: bjorn, yangyicong, skhan, linux-pci

Hi Saheed,

On Fri, Apr 24, 2020 at 04:27:11PM +0200, Bolarinwa Olayemi Saheed wrote:
> pcie_capability_read*() could return 0, -EINVAL, or any of the
> PCIBIOS_* error codes (which are positive).
> This is behaviour is now changed to return only PCIBIOS_* error
> codes on error.
> This is consistent with pci_read_config_*(). Callers can now have
> a consistent way for checking which error has occurred.
> 
> An audit of the callers of this function was made and no case was found
> where there is need for a change within the caller function or their
> dependencies down the heirarchy.
> Out of all caller functions discovered only 8 functions either persist the
> return value of pcie_capability_read*() or directly pass on the return
> value.
> 
> 1.) "./drivers/infiniband/hw/hfi1/pcie.c" :
> => pcie_speeds() line-306
> 
> 	if (ret) {
> 		dd_dev_err(dd, "Unable to read from PCI config\n");
> 		return ret;
> 	}
> 
> remarks: The variable "ret" is the captured return value.
>          This function passes on the return value. The return value was
> 	 store only by hfi1_init_dd() line-15076 in
>          ./drivers/infiniband/hw/hfi1/chip.c and it behave the same on all
> 	 errors. So this patch will not require a change in this function.

Thanks for the analysis, but I don't think it's quite complete.
Here's the call chain I see:

  local_pci_probe
    pci_drv->probe(..)
      init_one                        # hfi1_pci_driver.probe method
        hfi1_init_dd
          pcie_speeds
            pcie_capability_read_dword

If pcie_capability_read_dword() returns any non-zero value, that value
propagates all the way up and is eventually returned by init_one().
init_one() id called by local_pci_probe(), which interprets:

  < 0 as failure
    0 as success, and
  > 0 as "success but warn"

So previously an error from pcie_capability_read_dword() could cause
either failure or "success but warn" for the probe method, and after
this patch those errors will always cause "success but warn".

The current behavior is definitely a bug: if
pci_bus_read_config_word() returns PCIBIOS_BAD_REGISTER_NUMBER, that
causes pcie_capability_read_dword() to also return
PCIBIOS_BAD_REGISTER_NUMBER, which will lead to the probe succeeding
with a warning, when it should fail.

I think the fix is to make pcie_speeds() call pcibios_err_to_errno():

  ret = pcie_capability_read_dword(...);
  if (ret) {
    dd_dev_err(...);
    return pcibios_err_to_errno(ret);
  }

That could be its own separate preparatory patch before this
adjustment to pcie_capability_read_dword().

I didn't look at the other cases below, so I don't know whether they
are similar hidden problems.

> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 79c4a2ef269a..f0baab635b66 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -402,6 +402,10 @@ static bool pcie_capability_reg_implemented(struct pci_dev *dev, int pos)
>   * Note that these accessor functions are only for the "PCI Express
>   * Capability" (see PCIe spec r3.0, sec 7.8).  They do not apply to the
>   * other "PCI Express Extended Capabilities" (AER, VC, ACS, MFVC, etc.)
> + *
> + * On error, this function returns a PCIBIOS_* error code,
> + * you may want to use pcibios_err_to_errno()(include/linux/pci.h)
> + * to convert to a non-PCI code.
>   */
>  int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val)
>  {
> @@ -409,7 +413,7 @@ int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val)
>  
>  	*val = 0;
>  	if (pos & 1)
> -		return -EINVAL;
> +		return PCIBIOS_BAD_REGISTER_NUMBER;
>  
>  	if (pcie_capability_reg_implemented(dev, pos)) {
>  		ret = pci_read_config_word(dev, pci_pcie_cap(dev) + pos, val);
> @@ -444,7 +448,7 @@ int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val)
>  
>  	*val = 0;
>  	if (pos & 3)
> -		return -EINVAL;
> +		return PCIBIOS_BAD_REGISTER_NUMBER;
>  
>  	if (pcie_capability_reg_implemented(dev, pos)) {
>  		ret = pci_read_config_dword(dev, pci_pcie_cap(dev) + pos, val);

We need to make similar changes to pcie_capability_write_word() and
pcie_capability_write_dword(), of course.  I think it makes sense to
do them all in the same patch, since it's logically the same change
and all these functions should be consistent with each other.

Thanks for your work so far!  I know it's tedious and painful.  But
cleaning this up will make things a little bit less painful for those
who come after us :)

Bjorn

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] pci: Make return value of pcie_capability_read*() consistent
  2020-04-24 22:30 ` Bjorn Helgaas
@ 2020-04-26  9:51   ` Saheed Bolarinwa
  2020-04-27 18:13     ` Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Saheed Bolarinwa @ 2020-04-26  9:51 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: bjorn, yangyicong, skhan, linux-pci

Hello Bjorn,

On 4/25/20 12:30 AM, Bjorn Helgaas wrote:
> Hi Saheed,
>
> On Fri, Apr 24, 2020 at 04:27:11PM +0200, Bolarinwa Olayemi Saheed wrote:
>> pcie_capability_read*() could return 0, -EINVAL, or any of the
>> PCIBIOS_* error codes (which are positive).
>> This is behaviour is now changed to return only PCIBIOS_* error
>> codes on error.
>> This is consistent with pci_read_config_*(). Callers can now have
>> a consistent way for checking which error has occurred.
>>
>> An audit of the callers of this function was made and no case was found
>> where there is need for a change within the caller function or their
>> dependencies down the heirarchy.
>> Out of all caller functions discovered only 8 functions either persist the
>> return value of pcie_capability_read*() or directly pass on the return
>> value.
>>
>> 1.) "./drivers/infiniband/hw/hfi1/pcie.c" :
>> => pcie_speeds() line-306
>>
>> 	if (ret) {
>> 		dd_dev_err(dd, "Unable to read from PCI config\n");
>> 		return ret;
>> 	}
>>
>> remarks: The variable "ret" is the captured return value.
>>           This function passes on the return value. The return value was
>> 	 store only by hfi1_init_dd() line-15076 in
>>           ./drivers/infiniband/hw/hfi1/chip.c and it behave the same on all
>> 	 errors. So this patch will not require a change in this function.
> Thanks for the analysis, but I don't think it's quite complete.
> Here's the call chain I see:
>
>    local_pci_probe
>      pci_drv->probe(..)
>        init_one                        # hfi1_pci_driver.probe method
>          hfi1_init_dd
>            pcie_speeds
>              pcie_capability_read_dword

Thank you for pointing out the call chain. After checking it, I noticed 
that the

error is handled within the chain in two places without being passed on.

1. init_one() in ./drivers/infiniband/hw/hfil1/init.c

      ret = hfi1_init_dd(dd);
             if (ret)
                     goto clean_bail; /* error already printed */

    ...
    clean_bail:
             hfi1_pcie_cleanup(pdev);  /*EXITS*/

2. hfi1_init_dd() in ./drivers/infiniband/hw/hfil1/chip.c

         ret = pcie_speeds(dd);
         if (ret)
                 goto bail_cleanup;

         ...

         bail_cleanup:
                  hfi1_pcie_ddcleanup(dd);  /*EXITS*/

> If pcie_capability_read_dword() returns any non-zero value, that value
> propagates all the way up and is eventually returned by init_one().
> init_one() id called by local_pci_probe(), which interprets:
>
>    < 0 as failure
>      0 as success, and
>    > 0 as "success but warn"
>
> So previously an error from pcie_capability_read_dword() could cause
> either failure or "success but warn" for the probe method, and after
> this patch those errors will always cause "success but warn".
>
> The current behavior is definitely a bug: if
> pci_bus_read_config_word() returns PCIBIOS_BAD_REGISTER_NUMBER, that
> causes pcie_capability_read_dword() to also return
> PCIBIOS_BAD_REGISTER_NUMBER, which will lead to the probe succeeding
> with a warning, when it should fail.
>
> I think the fix is to make pcie_speeds() call pcibios_err_to_errno():
>
>    ret = pcie_capability_read_dword(...);
>    if (ret) {
>      dd_dev_err(...);
>      return pcibios_err_to_errno(ret);
>    }

I agree that this fix is needed, so that PCIBIOS_* error code are not 
passed on but replaced

with one consistent with non-PCI error codes.

> That could be its own separate preparatory patch before this
> adjustment to pcie_capability_read_dword().
>
> I didn't look at the other cases below, so I don't know whether they
> are similar hidden problems.

I will check again, please I will like to clarify if it will be to fine 
to just implement the conversion

(as suggested for pcie_speeds) in all found references, which passes on 
the error code.

>
>> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
>> index 79c4a2ef269a..f0baab635b66 100644
>> --- a/drivers/pci/access.c
>> +++ b/drivers/pci/access.c
>> @@ -402,6 +402,10 @@ static bool pcie_capability_reg_implemented(struct pci_dev *dev, int pos)
>>    * Note that these accessor functions are only for the "PCI Express
>>    * Capability" (see PCIe spec r3.0, sec 7.8).  They do not apply to the
>>    * other "PCI Express Extended Capabilities" (AER, VC, ACS, MFVC, etc.)
>> + *
>> + * On error, this function returns a PCIBIOS_* error code,
>> + * you may want to use pcibios_err_to_errno()(include/linux/pci.h)
>> + * to convert to a non-PCI code.
>>    */
>>   int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val)
>>   {
>> @@ -409,7 +413,7 @@ int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val)
>>   
>>   	*val = 0;
>>   	if (pos & 1)
>> -		return -EINVAL;
>> +		return PCIBIOS_BAD_REGISTER_NUMBER;
>>   
>>   	if (pcie_capability_reg_implemented(dev, pos)) {
>>   		ret = pci_read_config_word(dev, pci_pcie_cap(dev) + pos, val);
>> @@ -444,7 +448,7 @@ int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val)
>>   
>>   	*val = 0;
>>   	if (pos & 3)
>> -		return -EINVAL;
>> +		return PCIBIOS_BAD_REGISTER_NUMBER;
>>   
>>   	if (pcie_capability_reg_implemented(dev, pos)) {
>>   		ret = pci_read_config_dword(dev, pci_pcie_cap(dev) + pos, val);
> We need to make similar changes to pcie_capability_write_word() and
> pcie_capability_write_dword(), of course.  I think it makes sense to
> do them all in the same patch, since it's logically the same change
> and all these functions should be consistent with each other.

I will include them in.

Thank you.

- Saheed


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] pci: Make return value of pcie_capability_read*() consistent
  2020-04-26  9:51   ` Saheed Bolarinwa
@ 2020-04-27 18:13     ` Bjorn Helgaas
  2020-04-28  2:19       ` Yicong Yang
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2020-04-27 18:13 UTC (permalink / raw)
  To: Saheed Bolarinwa
  Cc: bjorn, yangyicong, skhan, linux-pci, Thomas Bogendoerfer,
	linux-mips, Michael Ellerman, linuxppc-dev, linux-kernel

[+cc Thomas, Michael, linux-mips, linux-ppc, LKML
Background:

  - PCI config accessors (pci_read_config_word(), etc) return 0 or a
    positive error (PCIBIOS_BAD_REGISTER_NUMBER, etc).

  - PCI Express capability accessors (pcie_capability_read_word(),
    etc) return 0, a negative error (-EINVAL), or a positive error
    (PCIBIOS_BAD_REGISTER_NUMBER, etc).

  - The PCI Express case is hard for callers to deal with.  The
    original plan was to convert this case to either return 0 or
    positive errors, just like pci_read_config_word().

  - I'm raising the possibility of instead getting rid of the positive
    PCIBIOS_* error values completely and replacing them with -EINVAL,
    -ENOENT, etc.

  - Very few callers check the return codes at all.  Most of the ones
    that do either check for non-zero or use pcibios_err_to_errno() to
    convert PCIBIOS_* to -EINVAL, etc.

I added MIPS and powerpc folks to CC: just as FYI because you're the
biggest users of PCIBIOS_*.  The intent is that this would be zero
functional change.
]

On Sun, Apr 26, 2020 at 11:51:30AM +0200, Saheed Bolarinwa wrote:
> On 4/25/20 12:30 AM, Bjorn Helgaas wrote:
> > On Fri, Apr 24, 2020 at 04:27:11PM +0200, Bolarinwa Olayemi Saheed wrote:
> > > pcie_capability_read*() could return 0, -EINVAL, or any of the
> > > PCIBIOS_* error codes (which are positive).
> > > This is behaviour is now changed to return only PCIBIOS_* error
> > > codes on error.
> > > This is consistent with pci_read_config_*(). Callers can now have
> > > a consistent way for checking which error has occurred.
> > > 
> > > An audit of the callers of this function was made and no case was found
> > > where there is need for a change within the caller function or their
> > > dependencies down the heirarchy.
> > > Out of all caller functions discovered only 8 functions either persist the
> > > return value of pcie_capability_read*() or directly pass on the return
> > > value.
> > > 
> > > 1.) "./drivers/infiniband/hw/hfi1/pcie.c" :
> > > => pcie_speeds() line-306
> > > 
> > > 	if (ret) {
> > > 		dd_dev_err(dd, "Unable to read from PCI config\n");
> > > 		return ret;
> > > 	}
> > > 
> > > remarks: The variable "ret" is the captured return value.
> > >           This function passes on the return value. The return value was
> > > 	 store only by hfi1_init_dd() line-15076 in
> > >           ./drivers/infiniband/hw/hfi1/chip.c and it behave the same on all
> > > 	 errors. So this patch will not require a change in this function.
> > Thanks for the analysis, but I don't think it's quite complete.
> > Here's the call chain I see:
> > 
> >    local_pci_probe
> >      pci_drv->probe(..)
> >        init_one                        # hfi1_pci_driver.probe method
> >          hfi1_init_dd
> >            pcie_speeds
> >              pcie_capability_read_dword
> 
> Thank you for pointing out the call chain. After checking it, I noticed that
> the
> 
> error is handled within the chain in two places without being passed on.
> 
> 1. init_one() in ./drivers/infiniband/hw/hfil1/init.c
> 
>      ret = hfi1_init_dd(dd);
>             if (ret)
>                     goto clean_bail; /* error already printed */
> 
>    ...
>    clean_bail:
>             hfi1_pcie_cleanup(pdev);  /*EXITS*/
> 
> 2. hfi1_init_dd() in ./drivers/infiniband/hw/hfil1/chip.c
> 
>         ret = pcie_speeds(dd);
>         if (ret)
>                 goto bail_cleanup;
> 
>         ...
> 
>         bail_cleanup:
>                  hfi1_pcie_ddcleanup(dd);  /*EXITS*/
> 
> > If pcie_capability_read_dword() returns any non-zero value, that value
> > propagates all the way up and is eventually returned by init_one().
> > init_one() id called by local_pci_probe(), which interprets:
> > 
> >    < 0 as failure
> >      0 as success, and
> >    > 0 as "success but warn"
> > 
> > So previously an error from pcie_capability_read_dword() could cause
> > either failure or "success but warn" for the probe method, and after
> > this patch those errors will always cause "success but warn".
> > 
> > The current behavior is definitely a bug: if
> > pci_bus_read_config_word() returns PCIBIOS_BAD_REGISTER_NUMBER, that
> > causes pcie_capability_read_dword() to also return
> > PCIBIOS_BAD_REGISTER_NUMBER, which will lead to the probe succeeding
> > with a warning, when it should fail.
> > 
> > I think the fix is to make pcie_speeds() call pcibios_err_to_errno():
> > 
> >    ret = pcie_capability_read_dword(...);
> >    if (ret) {
> >      dd_dev_err(...);
> >      return pcibios_err_to_errno(ret);
> >    }
> 
> I agree that this fix is needed, so that PCIBIOS_* error code are
> not passed on but replaced
> 
> with one consistent with non-PCI error codes.
> 
> > That could be its own separate preparatory patch before this
> > adjustment to pcie_capability_read_dword().
> > 
> > I didn't look at the other cases below, so I don't know whether
> > they are similar hidden problems.
> 
> I will check again, please I will like to clarify if it will be to
> fine to just implement the conversion
> 
> (as suggested for pcie_speeds) in all found references, which passes
> on the error code.

I'm starting to think we're approaching this backwards.  I searched
for PCIBIOS_FUNC_NOT_SUPPORTED, PCIBIOS_BAD_VENDOR_ID, and the other
error values.  Almost every use is a *return* in a config accessor.
There are very, very few *tests* for these values.

For example, the only tests for PCIBIOS_FUNC_NOT_SUPPORTED are in
xen_pcibios_err_to_errno() and pcibios_err_to_errno(), i.e., we're
just converting that value to -ENOENT or the Xen-specific thing.

So I think the best approach might be to remove the PCIBIOS_* error
values completely and replace them with the corresponding values from
pcibios_err_to_errno().  For example, a part of the patch would look
like this:

diff --git a/arch/mips/pci/ops-emma2rh.c b/arch/mips/pci/ops-emma2rh.c
index 65f47344536c..d4d9c902c147 100644
--- a/arch/mips/pci/ops-emma2rh.c
+++ b/arch/mips/pci/ops-emma2rh.c
@@ -100,7 +100,7 @@ static int pci_config_read(struct pci_bus *bus, unsigned int devfn, int where,
 		break;
 	default:
 		emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
-		return PCIBIOS_FUNC_NOT_SUPPORTED;
+		return -ENOENT;
 	}
 
 	emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
@@ -149,7 +149,7 @@ static int pci_config_write(struct pci_bus *bus, unsigned int devfn, int where,
 		break;
 	default:
 		emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
-		return PCIBIOS_FUNC_NOT_SUPPORTED;
+		return -ENOENT;
 	}
 	*(volatile u32 *)(base + (PCI_FUNC(devfn) << 8) +
 			  (where & 0xfffffffc)) = data;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 83ce1cdf5676..f95637a8d391 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -675,7 +675,6 @@ static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false;
 
 /* Error values that may be returned by PCI functions */
 #define PCIBIOS_SUCCESSFUL		0x00
-#define PCIBIOS_FUNC_NOT_SUPPORTED	0x81
 #define PCIBIOS_BAD_VENDOR_ID		0x83
 #define PCIBIOS_DEVICE_NOT_FOUND	0x86
 #define PCIBIOS_BAD_REGISTER_NUMBER	0x87
@@ -689,8 +688,6 @@ static inline int pcibios_err_to_errno(int err)
 		return err; /* Assume already errno */
 
 	switch (err) {
-	case PCIBIOS_FUNC_NOT_SUPPORTED:
-		return -ENOENT;
 	case PCIBIOS_BAD_VENDOR_ID:
 		return -ENOTTY;
 	case PCIBIOS_DEVICE_NOT_FOUND:

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] pci: Make return value of pcie_capability_read*() consistent
  2020-04-27 18:13     ` Bjorn Helgaas
@ 2020-04-28  2:19       ` Yicong Yang
  2020-04-28 11:43         ` Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Yicong Yang @ 2020-04-28  2:19 UTC (permalink / raw)
  To: Bjorn Helgaas, Saheed Bolarinwa
  Cc: bjorn, skhan, linux-pci, Thomas Bogendoerfer, linux-mips,
	Michael Ellerman, linuxppc-dev, linux-kernel

On 2020/4/28 2:13, Bjorn Helgaas wrote:
>
> I'm starting to think we're approaching this backwards.  I searched
> for PCIBIOS_FUNC_NOT_SUPPORTED, PCIBIOS_BAD_VENDOR_ID, and the other
> error values.  Almost every use is a *return* in a config accessor.
> There are very, very few *tests* for these values.

If we have certain reasons to reserve PCI_BIOS* error to identify PCI errors
in PCI drivers, maybe redefine the PCI_BIOS* to generic error codes can solve
the issues, and no need to call pcibios_err_to_errno() to do the conversion.
Few changes may be made to current codes. One possible patch may
look like below. Otherwise, maybe convert all PCI_BIOS* errors to generic error
codes is a better idea.

Not sure it's the best way or not. Just FYI.


diff --git a/include/linux/pci.h b/include/linux/pci.h
index 83ce1cd..843987c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -675,14 +675,18 @@ static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false;
 
 /* Error values that may be returned by PCI functions */
 #define PCIBIOS_SUCCESSFUL		0x00
-#define PCIBIOS_FUNC_NOT_SUPPORTED	0x81
-#define PCIBIOS_BAD_VENDOR_ID		0x83
-#define PCIBIOS_DEVICE_NOT_FOUND	0x86
-#define PCIBIOS_BAD_REGISTER_NUMBER	0x87
-#define PCIBIOS_SET_FAILED		0x88
-#define PCIBIOS_BUFFER_TOO_SMALL	0x89
-
-/* Translate above to generic errno for passing back through non-PCI code */
+#define PCIBIOS_FUNC_NOT_SUPPORTED	-ENOENT
+#define PCIBIOS_BAD_VENDOR_ID		-ENOTTY
+#define PCIBIOS_DEVICE_NOT_FOUND	-ENODEV
+#define PCIBIOS_BAD_REGISTER_NUMBER	-EFAULT
+#define PCIBIOS_SET_FAILED		-EIO
+#define PCIBIOS_BUFFER_TOO_SMALL	-ENOSPC
+
+/**
+ * Translate above to generic errno for passing back through non-PCI code
+ *
+ * Deprecated. Use the PCIBIOS_* directly without a translation.
+ */
 static inline int pcibios_err_to_errno(int err)
 {
 	if (err <= PCIBIOS_SUCCESSFUL)
@@ -690,17 +694,12 @@ static inline int pcibios_err_to_errno(int err)
 
 	switch (err) {
 	case PCIBIOS_FUNC_NOT_SUPPORTED:
-		return -ENOENT;
 	case PCIBIOS_BAD_VENDOR_ID:
-		return -ENOTTY;
 	case PCIBIOS_DEVICE_NOT_FOUND:
-		return -ENODEV;
 	case PCIBIOS_BAD_REGISTER_NUMBER:
-		return -EFAULT;
 	case PCIBIOS_SET_FAILED:
-		return -EIO;
 	case PCIBIOS_BUFFER_TOO_SMALL:
-		return -ENOSPC;
+		return err;
 	}
 
 	return -ERANGE;


>
> For example, the only tests for PCIBIOS_FUNC_NOT_SUPPORTED are in
> xen_pcibios_err_to_errno() and pcibios_err_to_errno(), i.e., we're
> just converting that value to -ENOENT or the Xen-specific thing.
>
> So I think the best approach might be to remove the PCIBIOS_* error
> values completely and replace them with the corresponding values from
> pcibios_err_to_errno().  For example, a part of the patch would look
> like this:
>
> diff --git a/arch/mips/pci/ops-emma2rh.c b/arch/mips/pci/ops-emma2rh.c
> index 65f47344536c..d4d9c902c147 100644
> --- a/arch/mips/pci/ops-emma2rh.c
> +++ b/arch/mips/pci/ops-emma2rh.c
> @@ -100,7 +100,7 @@ static int pci_config_read(struct pci_bus *bus, unsigned int devfn, int where,
>  		break;
>  	default:
>  		emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
> -		return PCIBIOS_FUNC_NOT_SUPPORTED;
> +		return -ENOENT;
>  	}
>  
>  	emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
> @@ -149,7 +149,7 @@ static int pci_config_write(struct pci_bus *bus, unsigned int devfn, int where,
>  		break;
>  	default:
>  		emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
> -		return PCIBIOS_FUNC_NOT_SUPPORTED;
> +		return -ENOENT;
>  	}
>  	*(volatile u32 *)(base + (PCI_FUNC(devfn) << 8) +
>  			  (where & 0xfffffffc)) = data;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 83ce1cdf5676..f95637a8d391 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -675,7 +675,6 @@ static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false;
>  
>  /* Error values that may be returned by PCI functions */
>  #define PCIBIOS_SUCCESSFUL		0x00
> -#define PCIBIOS_FUNC_NOT_SUPPORTED	0x81
>  #define PCIBIOS_BAD_VENDOR_ID		0x83
>  #define PCIBIOS_DEVICE_NOT_FOUND	0x86
>  #define PCIBIOS_BAD_REGISTER_NUMBER	0x87
> @@ -689,8 +688,6 @@ static inline int pcibios_err_to_errno(int err)
>  		return err; /* Assume already errno */
>  
>  	switch (err) {
> -	case PCIBIOS_FUNC_NOT_SUPPORTED:
> -		return -ENOENT;
>  	case PCIBIOS_BAD_VENDOR_ID:
>  		return -ENOTTY;
>  	case PCIBIOS_DEVICE_NOT_FOUND:
> .
>


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] pci: Make return value of pcie_capability_read*() consistent
  2020-04-28  2:19       ` Yicong Yang
@ 2020-04-28 11:43         ` Bjorn Helgaas
  0 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2020-04-28 11:43 UTC (permalink / raw)
  To: Yicong Yang
  Cc: Saheed Bolarinwa, bjorn, skhan, linux-pci, Thomas Bogendoerfer,
	linux-mips, Michael Ellerman, linuxppc-dev, linux-kernel

On Tue, Apr 28, 2020 at 10:19:08AM +0800, Yicong Yang wrote:
> On 2020/4/28 2:13, Bjorn Helgaas wrote:
> >
> > I'm starting to think we're approaching this backwards.  I searched
> > for PCIBIOS_FUNC_NOT_SUPPORTED, PCIBIOS_BAD_VENDOR_ID, and the other
> > error values.  Almost every use is a *return* in a config accessor.
> > There are very, very few *tests* for these values.
> 
> If we have certain reasons to reserve PCI_BIOS* error to identify
> PCI errors in PCI drivers, maybe redefine the PCI_BIOS* to generic
> error codes can solve the issues, and no need to call
> pcibios_err_to_errno() to do the conversion.  Few changes may be
> made to current codes. One possible patch may look like below.
> Otherwise, maybe convert all PCI_BIOS* errors to generic error codes
> is a better idea.
> 
> Not sure it's the best way or not. Just FYI.

That's a brilliant idea!  We should still look carefully at all the
callers of the config accessors, but this would avoid changing all the
arch accessors, so the patch would be dramatically smaller.

> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 83ce1cd..843987c 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -675,14 +675,18 @@ static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false;
>  
>  /* Error values that may be returned by PCI functions */
>  #define PCIBIOS_SUCCESSFUL		0x00
> -#define PCIBIOS_FUNC_NOT_SUPPORTED	0x81
> -#define PCIBIOS_BAD_VENDOR_ID		0x83
> -#define PCIBIOS_DEVICE_NOT_FOUND	0x86
> -#define PCIBIOS_BAD_REGISTER_NUMBER	0x87
> -#define PCIBIOS_SET_FAILED		0x88
> -#define PCIBIOS_BUFFER_TOO_SMALL	0x89
> -
> -/* Translate above to generic errno for passing back through non-PCI code */
> +#define PCIBIOS_FUNC_NOT_SUPPORTED	-ENOENT
> +#define PCIBIOS_BAD_VENDOR_ID		-ENOTTY
> +#define PCIBIOS_DEVICE_NOT_FOUND	-ENODEV
> +#define PCIBIOS_BAD_REGISTER_NUMBER	-EFAULT
> +#define PCIBIOS_SET_FAILED		-EIO
> +#define PCIBIOS_BUFFER_TOO_SMALL	-ENOSPC
> +
> +/**
> + * Translate above to generic errno for passing back through non-PCI code
> + *
> + * Deprecated. Use the PCIBIOS_* directly without a translation.
> + */
>  static inline int pcibios_err_to_errno(int err)
>  {
>  	if (err <= PCIBIOS_SUCCESSFUL)
> @@ -690,17 +694,12 @@ static inline int pcibios_err_to_errno(int err)
>  
>  	switch (err) {
>  	case PCIBIOS_FUNC_NOT_SUPPORTED:
> -		return -ENOENT;
>  	case PCIBIOS_BAD_VENDOR_ID:
> -		return -ENOTTY;
>  	case PCIBIOS_DEVICE_NOT_FOUND:
> -		return -ENODEV;
>  	case PCIBIOS_BAD_REGISTER_NUMBER:
> -		return -EFAULT;
>  	case PCIBIOS_SET_FAILED:
> -		return -EIO;
>  	case PCIBIOS_BUFFER_TOO_SMALL:
> -		return -ENOSPC;
> +		return err;
>  	}
>  
>  	return -ERANGE;
> 
> > For example, the only tests for PCIBIOS_FUNC_NOT_SUPPORTED are in
> > xen_pcibios_err_to_errno() and pcibios_err_to_errno(), i.e., we're
> > just converting that value to -ENOENT or the Xen-specific thing.
> >
> > So I think the best approach might be to remove the PCIBIOS_* error
> > values completely and replace them with the corresponding values from
> > pcibios_err_to_errno().  For example, a part of the patch would look
> > like this:
> >
> > diff --git a/arch/mips/pci/ops-emma2rh.c b/arch/mips/pci/ops-emma2rh.c
> > index 65f47344536c..d4d9c902c147 100644
> > --- a/arch/mips/pci/ops-emma2rh.c
> > +++ b/arch/mips/pci/ops-emma2rh.c
> > @@ -100,7 +100,7 @@ static int pci_config_read(struct pci_bus *bus, unsigned int devfn, int where,
> >  		break;
> >  	default:
> >  		emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
> > -		return PCIBIOS_FUNC_NOT_SUPPORTED;
> > +		return -ENOENT;
> >  	}
> >  
> >  	emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
> > @@ -149,7 +149,7 @@ static int pci_config_write(struct pci_bus *bus, unsigned int devfn, int where,
> >  		break;
> >  	default:
> >  		emma2rh_out32(EMMA2RH_PCI_IWIN0_CTR, backup_win0);
> > -		return PCIBIOS_FUNC_NOT_SUPPORTED;
> > +		return -ENOENT;
> >  	}
> >  	*(volatile u32 *)(base + (PCI_FUNC(devfn) << 8) +
> >  			  (where & 0xfffffffc)) = data;
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 83ce1cdf5676..f95637a8d391 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -675,7 +675,6 @@ static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false;
> >  
> >  /* Error values that may be returned by PCI functions */
> >  #define PCIBIOS_SUCCESSFUL		0x00
> > -#define PCIBIOS_FUNC_NOT_SUPPORTED	0x81
> >  #define PCIBIOS_BAD_VENDOR_ID		0x83
> >  #define PCIBIOS_DEVICE_NOT_FOUND	0x86
> >  #define PCIBIOS_BAD_REGISTER_NUMBER	0x87
> > @@ -689,8 +688,6 @@ static inline int pcibios_err_to_errno(int err)
> >  		return err; /* Assume already errno */
> >  
> >  	switch (err) {
> > -	case PCIBIOS_FUNC_NOT_SUPPORTED:
> > -		return -ENOENT;
> >  	case PCIBIOS_BAD_VENDOR_ID:
> >  		return -ENOTTY;
> >  	case PCIBIOS_DEVICE_NOT_FOUND:
> > .
> >
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-04-28 11:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-24 14:27 [PATCH v4] pci: Make return value of pcie_capability_read*() consistent Bolarinwa Olayemi Saheed
2020-04-24 22:30 ` Bjorn Helgaas
2020-04-26  9:51   ` Saheed Bolarinwa
2020-04-27 18:13     ` Bjorn Helgaas
2020-04-28  2:19       ` Yicong Yang
2020-04-28 11:43         ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).