[PATCH] PCI: rockchip: check link status when validating device

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] PCI: rockchip: check link status when validating device
@ 2017-05-19  6:58 Shawn Lin
  2017-05-23 18:00 ` Brian Norris
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Shawn Lin @ 2017-05-19  6:58 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-rockchip, Brian Norris, Jeffy Chen, Shawn Lin

This patch checks the link status before reading and
writing configure space of devices attached to the RC.
If the link status is down, we shouldn't try to access
the devices.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

 drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 0e020b6..1e64227 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
 	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
 }
 
+static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
+{
+	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
+					PCIE_CLIENT_BASIC_STATUS1));
+}
+
 static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
 				      struct pci_bus *bus, int dev)
 {
+	/* do not access the devices if the link isn't completed */
+	if (bus->number != rockchip->root_bus_nr) {
+		if (!rockchip_pcie_link_up(rockchip))
+			return 0;
+	}
+
 	/* access only one slot on each root port */
 	if (bus->number == rockchip->root_bus_nr && dev > 0)
 		return 0;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-19  6:58 [PATCH] PCI: rockchip: check link status when validating device Shawn Lin
@ 2017-05-23 18:00 ` Brian Norris
  2017-05-24  0:54   ` Shawn Lin
  2017-05-23 18:44 ` Brian Norris
  2017-05-23 19:44 ` [PATCH] PCI: rockchip: check link status when validating device Bjorn Helgaas
  2 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2017-05-23 18:00 UTC (permalink / raw)
  To: Shawn Lin; +Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> This patch checks the link status before reading and
> writing configure space of devices attached to the RC.
> If the link status is down, we shouldn't try to access
> the devices.

I'm curious, in what situations are you seeing the link down? In all the
cases where I can manage to screw up my endpoint and see system aborts
due to config accesses, this check still says the link is up. Presumably
you have some test cases that benefit from this though.

> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index 0e020b6..1e64227 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>  }
>  
> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
> +{
> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
> +					PCIE_CLIENT_BASIC_STATUS1));
> +}
> +
>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>  				      struct pci_bus *bus, int dev)
>  {
> +	/* do not access the devices if the link isn't completed */
> +	if (bus->number != rockchip->root_bus_nr) {
> +		if (!rockchip_pcie_link_up(rockchip))
> +			return 0;
> +	}

Seems like this should be the last check in the function, after you
check that the bus and dev numbers are reasonable?

Brian

> +
>  	/* access only one slot on each root port */
>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>  		return 0;
> -- 
> 1.9.1
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-19  6:58 [PATCH] PCI: rockchip: check link status when validating device Shawn Lin
  2017-05-23 18:00 ` Brian Norris
@ 2017-05-23 18:44 ` Brian Norris
  2017-05-23 19:36     ` Brian Norris
  2017-05-23 19:44 ` [PATCH] PCI: rockchip: check link status when validating device Bjorn Helgaas
  2 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2017-05-23 18:44 UTC (permalink / raw)
  To: Shawn Lin; +Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

I forgot, I had one more comment:

On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> This patch checks the link status before reading and
> writing configure space of devices attached to the RC.
> If the link status is down, we shouldn't try to access
> the devices.
> 
> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index 0e020b6..1e64227 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>  }
>  
> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
> +{
> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
> +					PCIE_CLIENT_BASIC_STATUS1));
> +}
> +
>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>  				      struct pci_bus *bus, int dev)
>  {
> +	/* do not access the devices if the link isn't completed */
> +	if (bus->number != rockchip->root_bus_nr) {
> +		if (!rockchip_pcie_link_up(rockchip))
> +			return 0;

Not exactly a criticism of this patch directly, but the error handling
sequence that this triggers is strange, and I think it's inconsistent.

A 0 result here becomes PCIBIOS_DEVICE_NOT_FOUND for either the read or
write conf helpers. But the high level code doesn't handle this
consistently. See, e.g., pci_read_config_byte() which can return regular
Linux error codes (like -ENODEV), except it also passes up the return
code of pci_read_config_byte() (a PCIBIOS_* code) directly.

So callers don't really know whether to treat the value from
pci_read_config_<foo>() as a PCIBIOS_* code (which should be translated
with pcibios_err_to_errno()) or as a standard Linux errno.

But then, there are relatively few callers (less than 10% of
pci_read_config_<foo>(); even fewer for writes) that actually check the
error codes...

Maybe the "fix" is to replace -ENODEV with PCIBIOS_DEVICE_NOT_FOUND for
the inconsistent cases (pci_{read,write}_config_{byte,word,dword}()).

Brian

> +	}
> +
>  	/* access only one slot on each root port */
>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>  		return 0;
> -- 
> 1.9.1
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] PCI: Make error code types consistent in pci_{read,write}_config_*
  2017-05-23 18:44 ` Brian Norris
@ 2017-05-23 19:36     ` Brian Norris
  0 siblings, 0 replies; 18+ messages in thread
From: Brian Norris @ 2017-05-23 19:36 UTC (permalink / raw)
  To: Shawn Lin, Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, Jeffy Chen, linux-rockchip,
	Krishna Dhulipala, Keith Busch, Christoph Hellwig, Wei Zhang

Callers normally treat the config space accessors as returning PCBIOS_*
error codes, not Linux error codes (or they don't look at them at all).
We have pcibios_err_to_errno(), in case the error code needs translated.

Fixes: 4b1038834739 ("PCI: Don't attempt config access to disconnected devices")
Signed-off-by: Brian Norris <briannorris@chromium.org>
---
+ others, change subject

On Tue, May 23, 2017 at 11:44:01AM -0700, Brian Norris wrote:
> But the high level code doesn't handle this
> consistently. See, e.g., pci_read_config_byte() which can return regular
> Linux error codes (like -ENODEV), except it also passes up the return
> code of pci_read_config_byte() (a PCIBIOS_* code) directly.

Apparently this is new (inconsistent) behavior in 4.12-rc1. Seems like
an oversight to me.

> So callers don't really know whether to treat the value from
> pci_read_config_<foo>() as a PCIBIOS_* code (which should be translated
> with pcibios_err_to_errno()) or as a standard Linux errno.
> 
> But then, there are relatively few callers (less than 10% of
> pci_read_config_<foo>(); even fewer for writes) that actually check the
> error codes...
> 
> Maybe the "fix" is to replace -ENODEV with PCIBIOS_DEVICE_NOT_FOUND for
> the inconsistent cases (pci_{read,write}_config_{byte,word,dword}()).

Fix implemented in the surrounding patch.

 drivers/pci/access.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 74cf5fffb1e1..c80e37a69305 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -896,7 +896,7 @@ int pci_read_config_byte(const struct pci_dev *dev, int where, u8 *val)
 {
 	if (pci_dev_is_disconnected(dev)) {
 		*val = ~0;
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	}
 	return pci_bus_read_config_byte(dev->bus, dev->devfn, where, val);
 }
@@ -906,7 +906,7 @@ int pci_read_config_word(const struct pci_dev *dev, int where, u16 *val)
 {
 	if (pci_dev_is_disconnected(dev)) {
 		*val = ~0;
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	}
 	return pci_bus_read_config_word(dev->bus, dev->devfn, where, val);
 }
@@ -917,7 +917,7 @@ int pci_read_config_dword(const struct pci_dev *dev, int where,
 {
 	if (pci_dev_is_disconnected(dev)) {
 		*val = ~0;
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	}
 	return pci_bus_read_config_dword(dev->bus, dev->devfn, where, val);
 }
@@ -926,7 +926,7 @@ EXPORT_SYMBOL(pci_read_config_dword);
 int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val)
 {
 	if (pci_dev_is_disconnected(dev))
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	return pci_bus_write_config_byte(dev->bus, dev->devfn, where, val);
 }
 EXPORT_SYMBOL(pci_write_config_byte);
@@ -934,7 +934,7 @@ EXPORT_SYMBOL(pci_write_config_byte);
 int pci_write_config_word(const struct pci_dev *dev, int where, u16 val)
 {
 	if (pci_dev_is_disconnected(dev))
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	return pci_bus_write_config_word(dev->bus, dev->devfn, where, val);
 }
 EXPORT_SYMBOL(pci_write_config_word);
@@ -943,7 +943,7 @@ int pci_write_config_dword(const struct pci_dev *dev, int where,
 					 u32 val)
 {
 	if (pci_dev_is_disconnected(dev))
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
 }
 EXPORT_SYMBOL(pci_write_config_dword);
-- 
2.13.0.219.gdb65acc882-goog

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH] PCI: Make error code types consistent in pci_{read,write}_config_*
@ 2017-05-23 19:36     ` Brian Norris
  0 siblings, 0 replies; 18+ messages in thread
From: Brian Norris @ 2017-05-23 19:36 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, linux-pci, Jeffy Chen, linux-rockchip,
	Krishna Dhulipala, Keith Busch, Christoph Hellwig, Wei Zhang

Callers normally treat the config space accessors as returning PCBIOS_*
error codes, not Linux error codes (or they don't look at them at all).
We have pcibios_err_to_errno(), in case the error code needs translated.

Fixes: 4b1038834739 ("PCI: Don't attempt config access to disconnected devices")
Signed-off-by: Brian Norris <briannorris@chromium.org>
---
+ others, change subject

On Tue, May 23, 2017 at 11:44:01AM -0700, Brian Norris wrote:
> But the high level code doesn't handle this
> consistently. See, e.g., pci_read_config_byte() which can return regular
> Linux error codes (like -ENODEV), except it also passes up the return
> code of pci_read_config_byte() (a PCIBIOS_* code) directly.

Apparently this is new (inconsistent) behavior in 4.12-rc1. Seems like
an oversight to me.

> So callers don't really know whether to treat the value from
> pci_read_config_<foo>() as a PCIBIOS_* code (which should be translated
> with pcibios_err_to_errno()) or as a standard Linux errno.
> 
> But then, there are relatively few callers (less than 10% of
> pci_read_config_<foo>(); even fewer for writes) that actually check the
> error codes...
> 
> Maybe the "fix" is to replace -ENODEV with PCIBIOS_DEVICE_NOT_FOUND for
> the inconsistent cases (pci_{read,write}_config_{byte,word,dword}()).

Fix implemented in the surrounding patch.

 drivers/pci/access.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 74cf5fffb1e1..c80e37a69305 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -896,7 +896,7 @@ int pci_read_config_byte(const struct pci_dev *dev, int where, u8 *val)
 {
 	if (pci_dev_is_disconnected(dev)) {
 		*val = ~0;
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	}
 	return pci_bus_read_config_byte(dev->bus, dev->devfn, where, val);
 }
@@ -906,7 +906,7 @@ int pci_read_config_word(const struct pci_dev *dev, int where, u16 *val)
 {
 	if (pci_dev_is_disconnected(dev)) {
 		*val = ~0;
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	}
 	return pci_bus_read_config_word(dev->bus, dev->devfn, where, val);
 }
@@ -917,7 +917,7 @@ int pci_read_config_dword(const struct pci_dev *dev, int where,
 {
 	if (pci_dev_is_disconnected(dev)) {
 		*val = ~0;
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	}
 	return pci_bus_read_config_dword(dev->bus, dev->devfn, where, val);
 }
@@ -926,7 +926,7 @@ EXPORT_SYMBOL(pci_read_config_dword);
 int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val)
 {
 	if (pci_dev_is_disconnected(dev))
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	return pci_bus_write_config_byte(dev->bus, dev->devfn, where, val);
 }
 EXPORT_SYMBOL(pci_write_config_byte);
@@ -934,7 +934,7 @@ EXPORT_SYMBOL(pci_write_config_byte);
 int pci_write_config_word(const struct pci_dev *dev, int where, u16 val)
 {
 	if (pci_dev_is_disconnected(dev))
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	return pci_bus_write_config_word(dev->bus, dev->devfn, where, val);
 }
 EXPORT_SYMBOL(pci_write_config_word);
@@ -943,7 +943,7 @@ int pci_write_config_dword(const struct pci_dev *dev, int where,
 					 u32 val)
 {
 	if (pci_dev_is_disconnected(dev))
-		return -ENODEV;
+		return PCIBIOS_DEVICE_NOT_FOUND;
 	return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
 }
 EXPORT_SYMBOL(pci_write_config_dword);
-- 
2.13.0.219.gdb65acc882-goog

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-19  6:58 [PATCH] PCI: rockchip: check link status when validating device Shawn Lin
  2017-05-23 18:00 ` Brian Norris
  2017-05-23 18:44 ` Brian Norris
@ 2017-05-23 19:44 ` Bjorn Helgaas
  2017-05-24  1:04     ` Shawn Lin
  2 siblings, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2017-05-23 19:44 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Brian Norris, Jeffy Chen

On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> This patch checks the link status before reading and
> writing configure space of devices attached to the RC.
> If the link status is down, we shouldn't try to access
> the devices.

What bad things happen without this patch?

It's racy to check the link status, then do the config access.  The
link might go down after we check but before we can perform the
access.

> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index 0e020b6..1e64227 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>  }
>  
> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
> +{
> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
> +					PCIE_CLIENT_BASIC_STATUS1));
> +}
> +
>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>  				      struct pci_bus *bus, int dev)
>  {
> +	/* do not access the devices if the link isn't completed */
> +	if (bus->number != rockchip->root_bus_nr) {
> +		if (!rockchip_pcie_link_up(rockchip))
> +			return 0;
> +	}
> +
>  	/* access only one slot on each root port */
>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>  		return 0;
> -- 
> 1.9.1
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-23 18:00 ` Brian Norris
@ 2017-05-24  0:54   ` Shawn Lin
  2017-05-24  1:00     ` Brian Norris
  0 siblings, 1 reply; 18+ messages in thread
From: Shawn Lin @ 2017-05-24  0:54 UTC (permalink / raw)
  To: Brian Norris
  Cc: shawn.lin, Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

Hi Brian,

在 2017/5/24 2:00, Brian Norris 写道:
> On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
>> This patch checks the link status before reading and
>> writing configure space of devices attached to the RC.
>> If the link status is down, we shouldn't try to access
>> the devices.
>
> I'm curious, in what situations are you seeing the link down? In all the
> cases where I can manage to screw up my endpoint and see system aborts
> due to config accesses, this check still says the link is up. Presumably
> you have some test cases that benefit from this though.

Of course. This patch doesn't prevent all these cases, for instance,
you do a memory read/write in the EP function driver, since it doesn't
call these two APIs at all.

The reason for me to added this check is that I saw a external abort
down to rockchip_pcie_rd_own_conf, of which I highly suspected was that
the link was re-init or total broken at that time.


>
>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
>> ---
>>
>>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
>> index 0e020b6..1e64227 100644
>> --- a/drivers/pci/host/pcie-rockchip.c
>> +++ b/drivers/pci/host/pcie-rockchip.c
>> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>>  }
>>
>> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
>> +{
>> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
>> +					PCIE_CLIENT_BASIC_STATUS1));
>> +}
>> +
>>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>>  				      struct pci_bus *bus, int dev)
>>  {
>> +	/* do not access the devices if the link isn't completed */
>> +	if (bus->number != rockchip->root_bus_nr) {
>> +		if (!rockchip_pcie_link_up(rockchip))
>> +			return 0;
>> +	}
>
> Seems like this should be the last check in the function, after you
> check that the bus and dev numbers are reasonable?
>

I am fine with either one.


> Brian
>
>> +
>>  	/* access only one slot on each root port */
>>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>>  		return 0;
>> --
>> 1.9.1
>>
>>
>
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-24  0:54   ` Shawn Lin
@ 2017-05-24  1:00     ` Brian Norris
  2017-05-24  1:14       ` Shawn Lin
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2017-05-24  1:00 UTC (permalink / raw)
  To: Shawn Lin; +Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

On Wed, May 24, 2017 at 08:54:14AM +0800, Shawn Lin wrote:
> 在 2017/5/24 2:00, Brian Norris 写道:
> >On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> >>This patch checks the link status before reading and
> >>writing configure space of devices attached to the RC.
> >>If the link status is down, we shouldn't try to access
> >>the devices.
> >
> >I'm curious, in what situations are you seeing the link down? In all the
> >cases where I can manage to screw up my endpoint and see system aborts
> >due to config accesses, this check still says the link is up. Presumably
> >you have some test cases that benefit from this though.

NB: Bjorn asked a similar question in a different form. The underlying
concern though, is that this is racy.

> Of course. This patch doesn't prevent all these cases, for instance,
> you do a memory read/write in the EP function driver, since it doesn't
> call these two APIs at all.

Of course. I'm only talking about config accesses.

> The reason for me to added this check is that I saw a external abort
> down to rockchip_pcie_rd_own_conf, of which I highly suspected was that
> the link was re-init or total broken at that time.

I've seen plenty of aborts in this function as well, but I've verified
that the link was still reported "up" in all the cases I could reproduce.

So, do you "suspect" or did you "prove"? e.g., log cases where this
check actually helps?

And to Bjorn's point: do you know *why* such cases were hit? That would
help to understand if the cases you're worrying about are hopelessly
racy, or if there's some way to ensure synchronization.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
@ 2017-05-24  1:04     ` Shawn Lin
  0 siblings, 0 replies; 18+ messages in thread
From: Shawn Lin @ 2017-05-24  1:04 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: shawn.lin, Bjorn Helgaas, linux-pci, linux-rockchip,
	Brian Norris, Jeffy Chen

Hi Bjorn,

在 2017/5/24 3:44, Bjorn Helgaas 写道:
> On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
>> This patch checks the link status before reading and
>> writing configure space of devices attached to the RC.
>> If the link status is down, we shouldn't try to access
>> the devices.
>
> What bad things happen without this patch?
>
> It's racy to check the link status, then do the config access.  The
> link might go down after we check but before we can perform the
> access.

yes, it cannot fix the issue cleanly. Also we cannot prevent
the access from memory read/write that doesn't call the
pci_read/write_config_foo APIs.

I just thought we could catch one luckly before performing
configure access.

>
>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
>> ---
>>
>>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
>> index 0e020b6..1e64227 100644
>> --- a/drivers/pci/host/pcie-rockchip.c
>> +++ b/drivers/pci/host/pcie-rockchip.c
>> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>>  }
>>
>> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
>> +{
>> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
>> +					PCIE_CLIENT_BASIC_STATUS1));
>> +}
>> +
>>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>>  				      struct pci_bus *bus, int dev)
>>  {
>> +	/* do not access the devices if the link isn't completed */
>> +	if (bus->number != rockchip->root_bus_nr) {
>> +		if (!rockchip_pcie_link_up(rockchip))
>> +			return 0;
>> +	}
>> +
>>  	/* access only one slot on each root port */
>>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>>  		return 0;
>> --
>> 1.9.1
>>
>>
>
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
@ 2017-05-24  1:04     ` Shawn Lin
  0 siblings, 0 replies; 18+ messages in thread
From: Shawn Lin @ 2017-05-24  1:04 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jeffy Chen, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	shawn.lin-TNX95d0MmH7DzftRWevZcw, Brian Norris,
	linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Bjorn Helgaas

Hi Bjorn,

在 2017/5/24 3:44, Bjorn Helgaas 写道:
> On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
>> This patch checks the link status before reading and
>> writing configure space of devices attached to the RC.
>> If the link status is down, we shouldn't try to access
>> the devices.
>
> What bad things happen without this patch?
>
> It's racy to check the link status, then do the config access.  The
> link might go down after we check but before we can perform the
> access.

yes, it cannot fix the issue cleanly. Also we cannot prevent
the access from memory read/write that doesn't call the
pci_read/write_config_foo APIs.

I just thought we could catch one luckly before performing
configure access.

>
>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
>> ---
>>
>>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
>> index 0e020b6..1e64227 100644
>> --- a/drivers/pci/host/pcie-rockchip.c
>> +++ b/drivers/pci/host/pcie-rockchip.c
>> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>>  }
>>
>> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
>> +{
>> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
>> +					PCIE_CLIENT_BASIC_STATUS1));
>> +}
>> +
>>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>>  				      struct pci_bus *bus, int dev)
>>  {
>> +	/* do not access the devices if the link isn't completed */
>> +	if (bus->number != rockchip->root_bus_nr) {
>> +		if (!rockchip_pcie_link_up(rockchip))
>> +			return 0;
>> +	}
>> +
>>  	/* access only one slot on each root port */
>>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>>  		return 0;
>> --
>> 1.9.1
>>
>>
>
>
>


_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-24  1:00     ` Brian Norris
@ 2017-05-24  1:14       ` Shawn Lin
  2017-05-24  1:25         ` Brian Norris
  0 siblings, 1 reply; 18+ messages in thread
From: Shawn Lin @ 2017-05-24  1:14 UTC (permalink / raw)
  To: Brian Norris
  Cc: shawn.lin, Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

Hi Brian,

在 2017/5/24 9:00, Brian Norris 写道:
> On Wed, May 24, 2017 at 08:54:14AM +0800, Shawn Lin wrote:
>> 在 2017/5/24 2:00, Brian Norris 写道:
>>> On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
>>>> This patch checks the link status before reading and
>>>> writing configure space of devices attached to the RC.
>>>> If the link status is down, we shouldn't try to access
>>>> the devices.
>>>
>>> I'm curious, in what situations are you seeing the link down? In all the
>>> cases where I can manage to screw up my endpoint and see system aborts
>>> due to config accesses, this check still says the link is up. Presumably
>>> you have some test cases that benefit from this though.
>
> NB: Bjorn asked a similar question in a different form. The underlying
> concern though, is that this is racy.

yes, I saw that.

>
>> Of course. This patch doesn't prevent all these cases, for instance,
>> you do a memory read/write in the EP function driver, since it doesn't
>> call these two APIs at all.
>
> Of course. I'm only talking about config accesses.

okay.

>
>> The reason for me to added this check is that I saw a external abort
>> down to rockchip_pcie_rd_own_conf, of which I highly suspected was that
>> the link was re-init or total broken at that time.
>
> I've seen plenty of aborts in this function as well, but I've verified
> that the link was still reported "up" in all the cases I could reproduce.
>

I think it's reasonable as the link could be retrained automatically if
it's not totaly broken at all. Did you poweroff the endpoint and could
still pass this check?

> So, do you "suspect" or did you "prove"? e.g., log cases where this
> check actually helps?

I was powering off the devices and did a lspci, and saw the log cases
there. I will check this again.

>
> And to Bjorn's point: do you know *why* such cases were hit? That would
> help to understand if the cases you're worrying about are hopelessly
> racy, or if there's some way to ensure synchronization.
>
> Brian
>
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-24  1:04     ` Shawn Lin
  (?)
@ 2017-05-24  1:15     ` Brian Norris
  2017-05-24 21:33       ` Bjorn Helgaas
  -1 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2017-05-24  1:15 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

(Since Shawn didn't quite answer this piece)

On Wed, May 24, 2017 at 09:04:33AM +0800, Shawn Lin wrote:
> 在 2017/5/24 3:44, Bjorn Helgaas 写道:
> >On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> >>This patch checks the link status before reading and
> >>writing configure space of devices attached to the RC.
> >>If the link status is down, we shouldn't try to access
> >>the devices.
> >
> >What bad things happen without this patch?

On this SoC, I've seen this sort of behavior (reading the config space
when the device isn't responding) yield aborts, which panic the system. 

I can't speak exactly for which scenarios Shawn is addressing though. As
I mentioned, this patch doesn't do anything for me so far.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-24  1:14       ` Shawn Lin
@ 2017-05-24  1:25         ` Brian Norris
  0 siblings, 0 replies; 18+ messages in thread
From: Brian Norris @ 2017-05-24  1:25 UTC (permalink / raw)
  To: Shawn Lin; +Cc: Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

On Wed, May 24, 2017 at 09:14:52AM +0800, Shawn Lin wrote:
> 在 2017/5/24 9:00, Brian Norris 写道:
> >On Wed, May 24, 2017 at 08:54:14AM +0800, Shawn Lin wrote:
> >>The reason for me to added this check is that I saw a external abort
> >>down to rockchip_pcie_rd_own_conf, of which I highly suspected was that
> >>the link was re-init or total broken at that time.
> >
> >I've seen plenty of aborts in this function as well, but I've verified
> >that the link was still reported "up" in all the cases I could reproduce.
> >
> 
> I think it's reasonable as the link could be retrained automatically if
> it's not totaly broken at all. Did you poweroff the endpoint and could
> still pass this check?

I don't think I powered it off entirely, but I did try asserting its PD#
pin, which powers of most of the functionality -- enough that it
apparently causes aborts, but doesn't bring the link down.

> >So, do you "suspect" or did you "prove"? e.g., log cases where this
> >check actually helps?
> 
> I was powering off the devices and did a lspci, and saw the log cases
> there. I will check this again.
> 
> >
> >And to Bjorn's point: do you know *why* such cases were hit? That would
> >help to understand if the cases you're worrying about are hopelessly
> >racy, or if there's some way to ensure synchronization.

OK, so you've answered this question: losing power is hopelessly racy.

I guess it's up to Bjorn as to whether this racy check is useful at all
then.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-24  1:15     ` Brian Norris
@ 2017-05-24 21:33       ` Bjorn Helgaas
  2017-05-24 21:43         ` Brian Norris
  0 siblings, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2017-05-24 21:33 UTC (permalink / raw)
  To: Brian Norris
  Cc: Shawn Lin, Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

On Tue, May 23, 2017 at 06:15:07PM -0700, Brian Norris wrote:
> (Since Shawn didn't quite answer this piece)
> 
> On Wed, May 24, 2017 at 09:04:33AM +0800, Shawn Lin wrote:
> > 在 2017/5/24 3:44, Bjorn Helgaas 写道:
> > >On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> > >>This patch checks the link status before reading and
> > >>writing configure space of devices attached to the RC.
> > >>If the link status is down, we shouldn't try to access
> > >>the devices.
> > >
> > >What bad things happen without this patch?
> 
> On this SoC, I've seen this sort of behavior (reading the config space
> when the device isn't responding) yield aborts, which panic the system. 

Trying to read config space of a device that doesn't exist is an
essential part of enumeration, and we expect whatever error occurs at
the hardware level to get turned into 0xffffffff data at the CPU (as
in pci_bus_read_dev_vendor_id()).

Shawn mentioned some issue with memory read/write as well.  I think we
need to sort out how to handle both the config space issue and the
memory issue.  This patch seems like it papers over part of it and
reduces the urgency of finding a real solution.

I'm going to drop this for now, pending a more detailed explanation.

Bjorn

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: rockchip: check link status when validating device
  2017-05-24 21:33       ` Bjorn Helgaas
@ 2017-05-24 21:43         ` Brian Norris
  0 siblings, 0 replies; 18+ messages in thread
From: Brian Norris @ 2017-05-24 21:43 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Shawn Lin, Bjorn Helgaas, linux-pci, linux-rockchip, Jeffy Chen

On Wed, May 24, 2017 at 04:33:53PM -0500, Bjorn Helgaas wrote:
> On Tue, May 23, 2017 at 06:15:07PM -0700, Brian Norris wrote:
> > (Since Shawn didn't quite answer this piece)
> > 
> > On Wed, May 24, 2017 at 09:04:33AM +0800, Shawn Lin wrote:
> > > 在 2017/5/24 3:44, Bjorn Helgaas 写道:
> > > >What bad things happen without this patch?
> > 
> > On this SoC, I've seen this sort of behavior (reading the config space
> > when the device isn't responding) yield aborts, which panic the system. 
> 
> Trying to read config space of a device that doesn't exist is an
> essential part of enumeration, and we expect whatever error occurs at
> the hardware level to get turned into 0xffffffff data at the CPU (as
> in pci_bus_read_dev_vendor_id()).

Understood.

> Shawn mentioned some issue with memory read/write as well.  I think we
> need to sort out how to handle both the config space issue and the
> memory issue.

Is the memory space issue actually a problem? I don't see anything in
the spec that says how these should behave when the device is off.

(I mean, it's always nice if there are fewer ways to crash the system.
But I thought mem 0xffffffff was only a convention, not a standard.)

> This patch seems like it papers over part of it and
> reduces the urgency of finding a real solution.

I've been bugging Shawn about this for a while already. It's not clear
there's really a good solution so far, apart from hacking the arch
exception handlers, like you're doing in the imx6 driver:

commit 415b6185c541dc0a21457ff307cdb61950a6eb9f
Author: Lucas Stach <l.stach@pengutronix.de>
Date:   Mon May 22 17:06:30 2017 -0500

    PCI: imx6: Fix config read timeout handling

I don't think the ARM64 maintainers have been fond of adding similar
"hook" code...

> I'm going to drop this for now, pending a more detailed explanation.

Fine with me.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: Make error code types consistent in pci_{read,write}_config_*
  2017-05-23 19:36     ` Brian Norris
  (?)
@ 2017-05-25  6:50     ` Keith Busch
  -1 siblings, 0 replies; 18+ messages in thread
From: Keith Busch @ 2017-05-25  6:50 UTC (permalink / raw)
  To: Brian Norris
  Cc: Shawn Lin, Bjorn Helgaas, linux-pci, Jeffy Chen, linux-rockchip,
	Krishna Dhulipala, Christoph Hellwig, Wei Zhang

On Tue, May 23, 2017 at 12:36:58PM -0700, Brian Norris wrote:
> Callers normally treat the config space accessors as returning PCBIOS_*
> error codes, not Linux error codes (or they don't look at them at all).
> We have pcibios_err_to_errno(), in case the error code needs translated.
> 
> Fixes: 4b1038834739 ("PCI: Don't attempt config access to disconnected devices")
> Signed-off-by: Brian Norris <briannorris@chromium.org>

That looks fine to me.

Reviewed-by: Keith Busch <keith.busch@intel.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: Make error code types consistent in pci_{read,write}_config_*
@ 2017-05-26 21:40       ` Bjorn Helgaas
  0 siblings, 0 replies; 18+ messages in thread
From: Bjorn Helgaas @ 2017-05-26 21:40 UTC (permalink / raw)
  To: Brian Norris
  Cc: Shawn Lin, Bjorn Helgaas, linux-pci, Jeffy Chen, linux-rockchip,
	Krishna Dhulipala, Keith Busch, Christoph Hellwig, Wei Zhang

On Tue, May 23, 2017 at 12:36:58PM -0700, Brian Norris wrote:
> Callers normally treat the config space accessors as returning PCBIOS_*
> error codes, not Linux error codes (or they don't look at them at all).
> We have pcibios_err_to_errno(), in case the error code needs translated.
> 
> Fixes: 4b1038834739 ("PCI: Don't attempt config access to disconnected devices")
> Signed-off-by: Brian Norris <briannorris@chromium.org>

Applied with Keith's reviewed-by to for-linus for v4.12, thanks, Brian!

> ---
> + others, change subject
> 
> On Tue, May 23, 2017 at 11:44:01AM -0700, Brian Norris wrote:
> > But the high level code doesn't handle this
> > consistently. See, e.g., pci_read_config_byte() which can return regular
> > Linux error codes (like -ENODEV), except it also passes up the return
> > code of pci_read_config_byte() (a PCIBIOS_* code) directly.
> 
> Apparently this is new (inconsistent) behavior in 4.12-rc1. Seems like
> an oversight to me.
> 
> > So callers don't really know whether to treat the value from
> > pci_read_config_<foo>() as a PCIBIOS_* code (which should be translated
> > with pcibios_err_to_errno()) or as a standard Linux errno.
> > 
> > But then, there are relatively few callers (less than 10% of
> > pci_read_config_<foo>(); even fewer for writes) that actually check the
> > error codes...
> > 
> > Maybe the "fix" is to replace -ENODEV with PCIBIOS_DEVICE_NOT_FOUND for
> > the inconsistent cases (pci_{read,write}_config_{byte,word,dword}()).
> 
> Fix implemented in the surrounding patch.
> 
>  drivers/pci/access.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 74cf5fffb1e1..c80e37a69305 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -896,7 +896,7 @@ int pci_read_config_byte(const struct pci_dev *dev, int where, u8 *val)
>  {
>  	if (pci_dev_is_disconnected(dev)) {
>  		*val = ~0;
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	}
>  	return pci_bus_read_config_byte(dev->bus, dev->devfn, where, val);
>  }
> @@ -906,7 +906,7 @@ int pci_read_config_word(const struct pci_dev *dev, int where, u16 *val)
>  {
>  	if (pci_dev_is_disconnected(dev)) {
>  		*val = ~0;
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	}
>  	return pci_bus_read_config_word(dev->bus, dev->devfn, where, val);
>  }
> @@ -917,7 +917,7 @@ int pci_read_config_dword(const struct pci_dev *dev, int where,
>  {
>  	if (pci_dev_is_disconnected(dev)) {
>  		*val = ~0;
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	}
>  	return pci_bus_read_config_dword(dev->bus, dev->devfn, where, val);
>  }
> @@ -926,7 +926,7 @@ EXPORT_SYMBOL(pci_read_config_dword);
>  int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val)
>  {
>  	if (pci_dev_is_disconnected(dev))
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	return pci_bus_write_config_byte(dev->bus, dev->devfn, where, val);
>  }
>  EXPORT_SYMBOL(pci_write_config_byte);
> @@ -934,7 +934,7 @@ EXPORT_SYMBOL(pci_write_config_byte);
>  int pci_write_config_word(const struct pci_dev *dev, int where, u16 val)
>  {
>  	if (pci_dev_is_disconnected(dev))
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	return pci_bus_write_config_word(dev->bus, dev->devfn, where, val);
>  }
>  EXPORT_SYMBOL(pci_write_config_word);
> @@ -943,7 +943,7 @@ int pci_write_config_dword(const struct pci_dev *dev, int where,
>  					 u32 val)
>  {
>  	if (pci_dev_is_disconnected(dev))
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
>  }
>  EXPORT_SYMBOL(pci_write_config_dword);
> -- 
> 2.13.0.219.gdb65acc882-goog

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI: Make error code types consistent in pci_{read,write}_config_*
@ 2017-05-26 21:40       ` Bjorn Helgaas
  0 siblings, 0 replies; 18+ messages in thread
From: Bjorn Helgaas @ 2017-05-26 21:40 UTC (permalink / raw)
  To: Brian Norris
  Cc: Wei Zhang, linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA, Shawn Lin, Jeffy Chen,
	Keith Busch, Krishna Dhulipala, Bjorn Helgaas, Christoph Hellwig

On Tue, May 23, 2017 at 12:36:58PM -0700, Brian Norris wrote:
> Callers normally treat the config space accessors as returning PCBIOS_*
> error codes, not Linux error codes (or they don't look at them at all).
> We have pcibios_err_to_errno(), in case the error code needs translated.
> 
> Fixes: 4b1038834739 ("PCI: Don't attempt config access to disconnected devices")
> Signed-off-by: Brian Norris <briannorris-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>

Applied with Keith's reviewed-by to for-linus for v4.12, thanks, Brian!

> ---
> + others, change subject
> 
> On Tue, May 23, 2017 at 11:44:01AM -0700, Brian Norris wrote:
> > But the high level code doesn't handle this
> > consistently. See, e.g., pci_read_config_byte() which can return regular
> > Linux error codes (like -ENODEV), except it also passes up the return
> > code of pci_read_config_byte() (a PCIBIOS_* code) directly.
> 
> Apparently this is new (inconsistent) behavior in 4.12-rc1. Seems like
> an oversight to me.
> 
> > So callers don't really know whether to treat the value from
> > pci_read_config_<foo>() as a PCIBIOS_* code (which should be translated
> > with pcibios_err_to_errno()) or as a standard Linux errno.
> > 
> > But then, there are relatively few callers (less than 10% of
> > pci_read_config_<foo>(); even fewer for writes) that actually check the
> > error codes...
> > 
> > Maybe the "fix" is to replace -ENODEV with PCIBIOS_DEVICE_NOT_FOUND for
> > the inconsistent cases (pci_{read,write}_config_{byte,word,dword}()).
> 
> Fix implemented in the surrounding patch.
> 
>  drivers/pci/access.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 74cf5fffb1e1..c80e37a69305 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -896,7 +896,7 @@ int pci_read_config_byte(const struct pci_dev *dev, int where, u8 *val)
>  {
>  	if (pci_dev_is_disconnected(dev)) {
>  		*val = ~0;
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	}
>  	return pci_bus_read_config_byte(dev->bus, dev->devfn, where, val);
>  }
> @@ -906,7 +906,7 @@ int pci_read_config_word(const struct pci_dev *dev, int where, u16 *val)
>  {
>  	if (pci_dev_is_disconnected(dev)) {
>  		*val = ~0;
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	}
>  	return pci_bus_read_config_word(dev->bus, dev->devfn, where, val);
>  }
> @@ -917,7 +917,7 @@ int pci_read_config_dword(const struct pci_dev *dev, int where,
>  {
>  	if (pci_dev_is_disconnected(dev)) {
>  		*val = ~0;
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	}
>  	return pci_bus_read_config_dword(dev->bus, dev->devfn, where, val);
>  }
> @@ -926,7 +926,7 @@ EXPORT_SYMBOL(pci_read_config_dword);
>  int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val)
>  {
>  	if (pci_dev_is_disconnected(dev))
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	return pci_bus_write_config_byte(dev->bus, dev->devfn, where, val);
>  }
>  EXPORT_SYMBOL(pci_write_config_byte);
> @@ -934,7 +934,7 @@ EXPORT_SYMBOL(pci_write_config_byte);
>  int pci_write_config_word(const struct pci_dev *dev, int where, u16 val)
>  {
>  	if (pci_dev_is_disconnected(dev))
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	return pci_bus_write_config_word(dev->bus, dev->devfn, where, val);
>  }
>  EXPORT_SYMBOL(pci_write_config_word);
> @@ -943,7 +943,7 @@ int pci_write_config_dword(const struct pci_dev *dev, int where,
>  					 u32 val)
>  {
>  	if (pci_dev_is_disconnected(dev))
> -		return -ENODEV;
> +		return PCIBIOS_DEVICE_NOT_FOUND;
>  	return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
>  }
>  EXPORT_SYMBOL(pci_write_config_dword);
> -- 
> 2.13.0.219.gdb65acc882-goog

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-05-26 21:40 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-19  6:58 [PATCH] PCI: rockchip: check link status when validating device Shawn Lin
2017-05-23 18:00 ` Brian Norris
2017-05-24  0:54   ` Shawn Lin
2017-05-24  1:00     ` Brian Norris
2017-05-24  1:14       ` Shawn Lin
2017-05-24  1:25         ` Brian Norris
2017-05-23 18:44 ` Brian Norris
2017-05-23 19:36   ` [PATCH] PCI: Make error code types consistent in pci_{read,write}_config_* Brian Norris
2017-05-23 19:36     ` Brian Norris
2017-05-25  6:50     ` Keith Busch
2017-05-26 21:40     ` Bjorn Helgaas
2017-05-26 21:40       ` Bjorn Helgaas
2017-05-23 19:44 ` [PATCH] PCI: rockchip: check link status when validating device Bjorn Helgaas
2017-05-24  1:04   ` Shawn Lin
2017-05-24  1:04     ` Shawn Lin
2017-05-24  1:15     ` Brian Norris
2017-05-24 21:33       ` Bjorn Helgaas
2017-05-24 21:43         ` Brian Norris

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.