linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
@ 2021-02-10  2:05 Qiuxu Zhuo
  2021-02-10  4:33 ` Kelley, Sean V
  2021-02-10 17:12 ` Krzysztof Wilczyński
  0 siblings, 2 replies; 15+ messages in thread
From: Qiuxu Zhuo @ 2021-02-10  2:05 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Qiuxu Zhuo, Sean V Kelley, Luck, Tony, Jin, Wen, linux-pci, linux-kernel

On a Sapphire Rapids server, it failed to inject correctable errors
to the RCiEP device e8:02.0 which was associated with the RCEC device
e8:00.4. See the following error log before applying the patch:

aer-inject -s e8:02.0 examples/correctable
Error: Failed to write, No such device

This was because rcec_assoc_rciep() mistakenly used "rciep->devfn" as
device number to check whether the corresponding bit was set in
the RCiEPBitmap of the RCEC. So that the RCiEP device e8:02.0 wasn't
linked to the RCEC and resulted in the above error.

Fix it by using PCI_SLOT() to convert rciep->devfn to device number.
Ensure that the RCiEP devices associated with the RCEC are linked to
the RCEC as the RCEC is enumerated. After applying the patch, correctable
errors can be injected to the RCiEP successfully.

Reported-and-tested-by: Wen Jin <wen.jin@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
 drivers/pci/pcie/rcec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
index 2c5c552994e4..d0bcd141ac9c 100644
--- a/drivers/pci/pcie/rcec.c
+++ b/drivers/pci/pcie/rcec.c
@@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct pci_dev *rciep)
 
 	/* Same bus, so check bitmap */
 	for_each_set_bit(devn, &bitmap, 32)
-		if (devn == rciep->devfn)
+		if (devn == PCI_SLOT(rciep->devfn))
 			return true;
 
 	return false;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
  2021-02-10  2:05 [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices Qiuxu Zhuo
@ 2021-02-10  4:33 ` Kelley, Sean V
  2021-02-10 17:12 ` Krzysztof Wilczyński
  1 sibling, 0 replies; 15+ messages in thread
From: Kelley, Sean V @ 2021-02-10  4:33 UTC (permalink / raw)
  To: Zhuo, Qiuxu; +Cc: Bjorn Helgaas, Luck, Tony, Jin, Wen, Linux PCI, Linux List



> On Feb 9, 2021, at 6:05 PM, Qiuxu Zhuo <qiuxu.zhuo@intel.com> wrote:
> 
> On a Sapphire Rapids server, it failed to inject correctable errors
> to the RCiEP device e8:02.0 which was associated with the RCEC device
> e8:00.4. See the following error log before applying the patch:
> 
> aer-inject -s e8:02.0 examples/correctable
> Error: Failed to write, No such device
> 
> This was because rcec_assoc_rciep() mistakenly used "rciep->devfn" as
> device number to check whether the corresponding bit was set in
> the RCiEPBitmap of the RCEC. So that the RCiEP device e8:02.0 wasn't
> linked to the RCEC and resulted in the above error.
> 
> Fix it by using PCI_SLOT() to convert rciep->devfn to device number.
> Ensure that the RCiEP devices associated with the RCEC are linked to
> the RCEC as the RCEC is enumerated. After applying the patch, correctable
> errors can be injected to the RCiEP successfully.
> 
> Reported-and-tested-by: Wen Jin <wen.jin@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>

Reviewed-by: Sean V Kelley <sean.v.kelley@intel.com>

Thanks,

Sean

> ---
> drivers/pci/pcie/rcec.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
> index 2c5c552994e4..d0bcd141ac9c 100644
> --- a/drivers/pci/pcie/rcec.c
> +++ b/drivers/pci/pcie/rcec.c
> @@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct pci_dev *rciep)
> 
> 	/* Same bus, so check bitmap */
> 	for_each_set_bit(devn, &bitmap, 32)
> -		if (devn == rciep->devfn)
> +		if (devn == PCI_SLOT(rciep->devfn))
> 			return true;
> 
> 	return false;
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
  2021-02-10  2:05 [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices Qiuxu Zhuo
  2021-02-10  4:33 ` Kelley, Sean V
@ 2021-02-10 17:12 ` Krzysztof Wilczyński
  2021-02-18  3:00   ` Zhuo, Qiuxu
  1 sibling, 1 reply; 15+ messages in thread
From: Krzysztof Wilczyński @ 2021-02-10 17:12 UTC (permalink / raw)
  To: Qiuxu Zhuo
  Cc: Bjorn Helgaas, Sean V Kelley, Luck, Tony, Jin, Wen, linux-pci,
	linux-kernel

Hi Qiuxu,

Nice catch!  Thank you for sending the fix over!

[...]
> On a Sapphire Rapids server, it failed to inject correctable errors
> to the RCiEP device e8:02.0 which was associated with the RCEC device
> e8:00.4. See the following error log before applying the patch:
> 
> aer-inject -s e8:02.0 examples/correctable
> Error: Failed to write, No such device
> 
> This was because rcec_assoc_rciep() mistakenly used "rciep->devfn" as
> device number to check whether the corresponding bit was set in
> the RCiEPBitmap of the RCEC. So that the RCiEP device e8:02.0 wasn't
> linked to the RCEC and resulted in the above error.
> 
> Fix it by using PCI_SLOT() to convert rciep->devfn to device number.
> Ensure that the RCiEP devices associated with the RCEC are linked to
> the RCEC as the RCEC is enumerated. After applying the patch, correctable
> errors can be injected to the RCiEP successfully.

Would this only affect error injection or would this be also a generic
problem with the driver itself causing issues regardless of whether it
was an error injection or not for this particular device?  I am asking,
as there is a lot going on in the commit message.

I wonder if simplifying this commit message so that it clearly explains
what was broken, why, and how this patch is fixing it, would perhaps be
an option?  The backstory of how you found the issue while doing some
testing and error injection is nice, but not sure if needed.

What do you think?

Krzysztof

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
  2021-02-10 17:12 ` Krzysztof Wilczyński
@ 2021-02-18  3:00   ` Zhuo, Qiuxu
  2021-02-18 22:07     ` 'Krzysztof Wilczyński'
  0 siblings, 1 reply; 15+ messages in thread
From: Zhuo, Qiuxu @ 2021-02-18  3:00 UTC (permalink / raw)
  To: 'Krzysztof Wilczyński'
  Cc: Bjorn Helgaas, Kelley, Sean V, Luck, Tony, Jin, Wen, linux-pci,
	linux-kernel

Hi Krzysztof,

Sorry, just back from Chinese New Year holiday.

> From: Krzysztof Wilczyński <kw@linux.com>
> ...
> ...
> Would this only affect error injection or would this be also a generic problem
> with the driver itself causing issues regardless of whether it was an error
> injection or not for this particular device?  I am asking, as there is a lot going on
> in the commit message.

This is also a generic problem.

> I wonder if simplifying this commit message so that it clearly explains what was
> broken, why, and how this patch is fixing it, would perhaps be an option?  The
> backstory of how you found the issue while doing some testing and error
> injection is nice, but not sure if needed.
> 
> What do you think?

Agree to simplify the commit message. How about the following subject and commit message?

Subject:  
Use device number to check RCiEPBitmap of RCEC

Commit message: 
rcec_assoc_rciep() used the combination of device number and function number 'devfn' to check whether the corresponding bit in the RCiEPBimap of RCEC was set. According to [1], it only needs to use the device number to check the corresponding bit in the RCiEPBitmap was set. So fix it by using PCI_SLOT() to convert 'devfn' to device number for rcec_assoc_rciep().
[1] PCIe r5.0, sec "7.9.10.2 Association Bitmap for RCiEPs"


Thanks!
-Qiuxu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
  2021-02-18  3:00   ` Zhuo, Qiuxu
@ 2021-02-18 22:07     ` 'Krzysztof Wilczyński'
  2021-02-18 22:11       ` 'Krzysztof Wilczyński'
  2021-02-19  1:51       ` [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices Zhuo, Qiuxu
  0 siblings, 2 replies; 15+ messages in thread
From: 'Krzysztof Wilczyński' @ 2021-02-18 22:07 UTC (permalink / raw)
  To: Zhuo, Qiuxu
  Cc: Bjorn Helgaas, Kelley, Sean V, Luck, Tony, Jin, Wen, linux-pci,
	linux-kernel

[+cc Bjorn as we talked about RCiEP briefly on IRC]

Hello Qiuxu,

[...]
> Sorry, just back from Chinese New Year holiday.

Welcome back!  I hope you had a nice rest, and also Happy New Year!

[...]
> > Would this only affect error injection or would this be also a generic problem
> > with the driver itself causing issues regardless of whether it was an error
> > injection or not for this particular device?  I am asking, as there is a lot going on
> > in the commit message.
> 
> This is also a generic problem.

Good to know.  Bjorn was also wondering if this is potentially a sign of
a larger probed with the RCiEP support.

> > I wonder if simplifying this commit message so that it clearly explains what was
> > broken, why, and how this patch is fixing it, would perhaps be an option?  The
> > backstory of how you found the issue while doing some testing and error
> > injection is nice, but not sure if needed.
> > 
> > What do you think?
> 
> Agree to simplify the commit message. How about the following subject and commit message?
> 
> Subject:  
> Use device number to check RCiEPBitmap of RCEC
> 
> Commit message: 
> rcec_assoc_rciep() used the combination of device number and function
> number 'devfn' to check whether the corresponding bit in the
> RCiEPBimap of RCEC was set. According to [1], it only needs to use the
> device number to check the corresponding bit in the RCiEPBitmap was
> set. So fix it by using PCI_SLOT() to convert 'devfn' to device number
> for rcec_assoc_rciep(). [1] PCIe r5.0, sec "7.9.10.2 Association
> Bitmap for RCiEPs"

I took your suggestion and came up with the following:

  Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
  byte encoding the device and function number) as the device number to
  check whether the corresponding bit was set in the RCiEPBitmap of the
  RCEC (Root Complex Event Collector) while enumerating over each bit of
  the RCiEPBitmap.

  As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
  Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
  use a device number to check whether the corresponding bit was set in
  the RCiEPBitmap.

  Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
  of "rciep->devfn" to a device number to ensure that the RCiEP devices
  are associated with the RCEC are linked when the RCEC is enumerated.

Using either of the following as the subject:

  PCI/RCEC: Use device number to check RCiEPBitmap of RCEC
  PCI/RCEC: Fix RCiEP capable devices RCEC association

What do you think?  Also, feel free to change whatever you see fit, of
course, as tis is only a suggestion.

Krzysztof

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
  2021-02-18 22:07     ` 'Krzysztof Wilczyński'
@ 2021-02-18 22:11       ` 'Krzysztof Wilczyński'
  2021-02-19  1:52         ` Zhuo, Qiuxu
  2021-02-19  1:51       ` [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices Zhuo, Qiuxu
  1 sibling, 1 reply; 15+ messages in thread
From: 'Krzysztof Wilczyński' @ 2021-02-18 22:11 UTC (permalink / raw)
  To: Zhuo, Qiuxu
  Cc: Bjorn Helgaas, Kelley, Sean V, Luck, Tony, Jin, Wen, linux-pci,
	linux-kernel

Hi Qiuxu,

[...]
> > Agree to simplify the commit message. How about the following subject and commit message?
> > 
> > Subject:  
> > Use device number to check RCiEPBitmap of RCEC
> > 
> > Commit message: 
> > rcec_assoc_rciep() used the combination of device number and function
> > number 'devfn' to check whether the corresponding bit in the
> > RCiEPBimap of RCEC was set. According to [1], it only needs to use the
> > device number to check the corresponding bit in the RCiEPBitmap was
> > set. So fix it by using PCI_SLOT() to convert 'devfn' to device number
> > for rcec_assoc_rciep(). [1] PCIe r5.0, sec "7.9.10.2 Association
> > Bitmap for RCiEPs"
> 
> I took your suggestion and came up with the following:
> 
>   Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
>   byte encoding the device and function number) as the device number to
>   check whether the corresponding bit was set in the RCiEPBitmap of the
>   RCEC (Root Complex Event Collector) while enumerating over each bit of
>   the RCiEPBitmap.
> 
>   As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
>   Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
>   use a device number to check whether the corresponding bit was set in
>   the RCiEPBitmap.
> 
>   Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
>   of "rciep->devfn" to a device number to ensure that the RCiEP devices
>   are associated with the RCEC are linked when the RCEC is enumerated.
> 
> Using either of the following as the subject:
> 
>   PCI/RCEC: Use device number to check RCiEPBitmap of RCEC
>   PCI/RCEC: Fix RCiEP capable devices RCEC association
> 
> What do you think?  Also, feel free to change whatever you see fit, of
> course, as tis is only a suggestion.

We could probably add the following:

  Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")

Since this would where the issue was originally introduced.  I forgot to
mention this in the previous message, apologies.

Krzysztof

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
  2021-02-18 22:07     ` 'Krzysztof Wilczyński'
  2021-02-18 22:11       ` 'Krzysztof Wilczyński'
@ 2021-02-19  1:51       ` Zhuo, Qiuxu
  1 sibling, 0 replies; 15+ messages in thread
From: Zhuo, Qiuxu @ 2021-02-19  1:51 UTC (permalink / raw)
  To: 'Krzysztof Wilczyński'
  Cc: Bjorn Helgaas, Kelley, Sean V, Luck, Tony, Jin, Wen, linux-pci,
	linux-kernel

> ...
> 
> I took your suggestion and came up with the following:
> 
>   Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
>   byte encoding the device and function number) as the device number to
>   check whether the corresponding bit was set in the RCiEPBitmap of the
>   RCEC (Root Complex Event Collector) while enumerating over each bit of
>   the RCiEPBitmap.
> 
>   As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
>   Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
>   use a device number to check whether the corresponding bit was set in
>   the RCiEPBitmap.
> 
>   Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
>   of "rciep->devfn" to a device number to ensure that the RCiEP devices
>   are associated with the RCEC are linked when the RCEC is enumerated.
>
> Using either of the following as the subject:
> 
>   PCI/RCEC: Use device number to check RCiEPBitmap of RCEC
>   PCI/RCEC: Fix RCiEP capable devices RCEC association
> 
> What do you think?  Also, feel free to change whatever you see fit, of course, as
> tis is only a suggestion.
> 

Hi Krzysztof,

Thanks for improving the commit message. It looks clearer. 😊
Will send out a v2 with this commit message.

Thanks!
-Qiuxu


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
  2021-02-18 22:11       ` 'Krzysztof Wilczyński'
@ 2021-02-19  1:52         ` Zhuo, Qiuxu
  2021-02-19  2:23           ` [PATCH v2 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association Qiuxu Zhuo
  0 siblings, 1 reply; 15+ messages in thread
From: Zhuo, Qiuxu @ 2021-02-19  1:52 UTC (permalink / raw)
  To: 'Krzysztof Wilczyński'
  Cc: Bjorn Helgaas, Kelley, Sean V, Luck, Tony, Jin, Wen, linux-pci,
	linux-kernel

>...
> 
> We could probably add the following:
> 
>   Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
> 

OK. Will add this to the v2.

Thanks!
-Qiuxu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
  2021-02-19  1:52         ` Zhuo, Qiuxu
@ 2021-02-19  2:23           ` Qiuxu Zhuo
  2021-02-22  0:56             ` Krzysztof Wilczyński
  0 siblings, 1 reply; 15+ messages in thread
From: Qiuxu Zhuo @ 2021-02-19  2:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Qiuxu Zhuo, Krzysztof Wilczyński, Sean V Kelley, Luck, Tony,
	Jin, Wen, linux-pci, linux-kernel

Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
byte encoding the device and function number) as the device number to
check whether the corresponding bit was set in the RCiEPBitmap of the
RCEC (Root Complex Event Collector) while enumerating over each bit of
the RCiEPBitmap.

As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
use a device number to check whether the corresponding bit was set in
the RCiEPBitmap.

Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
of "rciep->devfn" to a device number to ensure that the RCiEP devices
associated with the RCEC are linked when the RCEC is enumerated.

[ Krzysztof: Update commit message. ]

Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
Reported-and-tested-by: Wen Jin <wen.jin@intel.com>
Reviewed-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
v1->v2:
  - Update the subject and the commit message.
  - Add 'Reviewed-by: Sean V Kelley <sean.v.kelley@intel.com>' to the SoB chain.

 drivers/pci/pcie/rcec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
index 2c5c552994e4..d0bcd141ac9c 100644
--- a/drivers/pci/pcie/rcec.c
+++ b/drivers/pci/pcie/rcec.c
@@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct pci_dev *rciep)
 
 	/* Same bus, so check bitmap */
 	for_each_set_bit(devn, &bitmap, 32)
-		if (devn == rciep->devfn)
+		if (devn == PCI_SLOT(rciep->devfn))
 			return true;
 
 	return false;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
  2021-02-19  2:23           ` [PATCH v2 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association Qiuxu Zhuo
@ 2021-02-22  0:56             ` Krzysztof Wilczyński
  2021-02-22  1:04               ` Zhuo, Qiuxu
  0 siblings, 1 reply; 15+ messages in thread
From: Krzysztof Wilczyński @ 2021-02-22  0:56 UTC (permalink / raw)
  To: Qiuxu Zhuo
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Sean V Kelley, Luck, Tony, Jin,
	Wen, linux-pci, linux-kernel

[+cc Lorenzo for visiblity]

Hi,

[...]
> Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
> of "rciep->devfn" to a device number to ensure that the RCiEP devices
> associated with the RCEC are linked when the RCEC is enumerated.
> 
> [ Krzysztof: Update commit message. ]
[...]

Thank you!  I appreciate that.  However, we probably should drop this
from the commit message.  Perhaps either Bjorn or Lorenzo could do it
when applying changes.

Krzysztof

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v2 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
  2021-02-22  0:56             ` Krzysztof Wilczyński
@ 2021-02-22  1:04               ` Zhuo, Qiuxu
  2021-02-22  1:17                 ` [PATCH v3 " Qiuxu Zhuo
  0 siblings, 1 reply; 15+ messages in thread
From: Zhuo, Qiuxu @ 2021-02-22  1:04 UTC (permalink / raw)
  To: 'Krzysztof Wilczyński'
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Kelley, Sean V, Luck, Tony,
	Jin, Wen, linux-pci, linux-kernel

> ...
> > [ Krzysztof: Update commit message. ]
> [...]
> 
> Thank you!  I appreciate that.  However, we probably should drop this from the
> commit message.  Perhaps either Bjorn or Lorenzo could do it when applying
> changes.

OK, will send out the v3 that drops "[ Krzysztof: Update commit message. ]" from the commit message.

-Qiuxu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
  2021-02-22  1:04               ` Zhuo, Qiuxu
@ 2021-02-22  1:17                 ` Qiuxu Zhuo
  2021-03-05  6:12                   ` Zhuo, Qiuxu
  2021-03-10 22:00                   ` Bjorn Helgaas
  0 siblings, 2 replies; 15+ messages in thread
From: Qiuxu Zhuo @ 2021-02-22  1:17 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Qiuxu Zhuo, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Sean V Kelley, Luck, Tony, Jin, Wen, linux-pci, linux-kernel

Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
byte encoding the device and function number) as the device number to
check whether the corresponding bit was set in the RCiEPBitmap of the
RCEC (Root Complex Event Collector) while enumerating over each bit of
the RCiEPBitmap.

As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
use a device number to check whether the corresponding bit was set in
the RCiEPBitmap.

Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
of "rciep->devfn" to a device number to ensure that the RCiEP devices
associated with the RCEC are linked when the RCEC is enumerated.

Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
Reported-and-tested-by: Wen Jin <wen.jin@intel.com>
Reviewed-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
v2->v3:
 Drop "[ Krzysztof: Update commit message. ]" from the commit message

 drivers/pci/pcie/rcec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
index 2c5c552994e4..d0bcd141ac9c 100644
--- a/drivers/pci/pcie/rcec.c
+++ b/drivers/pci/pcie/rcec.c
@@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct pci_dev *rciep)
 
 	/* Same bus, so check bitmap */
 	for_each_set_bit(devn, &bitmap, 32)
-		if (devn == rciep->devfn)
+		if (devn == PCI_SLOT(rciep->devfn))
 			return true;
 
 	return false;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
  2021-02-22  1:17                 ` [PATCH v3 " Qiuxu Zhuo
@ 2021-03-05  6:12                   ` Zhuo, Qiuxu
  2021-03-10 22:00                   ` Bjorn Helgaas
  1 sibling, 0 replies; 15+ messages in thread
From: Zhuo, Qiuxu @ 2021-03-05  6:12 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Lorenzo Pieralisi, Krzysztof Wilczyński, Kelley, Sean V,
	Luck, Tony, Jin, Wen, linux-pci, linux-kernel

Hi Bjorn,

Do you have any comments on this patch? If need any changes, please let me know. 
Thanks!

-Qiuxu

> -----Original Message-----
> From: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>
> Sent: Monday, February 22, 2021 9:17 AM
> To: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>; Lorenzo Pieralisi
> <lorenzo.pieralisi@arm.com>; Krzysztof Wilczyński <kw@linux.com>; Kelley,
> Sean V <sean.v.kelley@intel.com>; Luck, Tony <tony.luck@intel.com>; Jin, Wen
> <wen.jin@intel.com>; linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
> 
> Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single byte
> encoding the device and function number) as the device number to check
> whether the corresponding bit was set in the RCiEPBitmap of the RCEC (Root
> Complex Event Collector) while enumerating over each bit of the RCiEPBitmap.
> 
> As per the PCI Express Base Specification, Revision 5.0, Version 1.0, Section
> 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to use a device
> number to check whether the corresponding bit was set in the RCiEPBitmap.
> 
> Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value of
> "rciep->devfn" to a device number to ensure that the RCiEP devices associated
> with the RCEC are linked when the RCEC is enumerated.
> 
> Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
> Reported-and-tested-by: Wen Jin <wen.jin@intel.com>
> Reviewed-by: Sean V Kelley <sean.v.kelley@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> ---
> v2->v3:
>  Drop "[ Krzysztof: Update commit message. ]" from the commit message
> 
>  drivers/pci/pcie/rcec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c index
> 2c5c552994e4..d0bcd141ac9c 100644
> --- a/drivers/pci/pcie/rcec.c
> +++ b/drivers/pci/pcie/rcec.c
> @@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct
> pci_dev *rciep)
> 
>  	/* Same bus, so check bitmap */
>  	for_each_set_bit(devn, &bitmap, 32)
> -		if (devn == rciep->devfn)
> +		if (devn == PCI_SLOT(rciep->devfn))
>  			return true;
> 
>  	return false;
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
  2021-02-22  1:17                 ` [PATCH v3 " Qiuxu Zhuo
  2021-03-05  6:12                   ` Zhuo, Qiuxu
@ 2021-03-10 22:00                   ` Bjorn Helgaas
  2021-03-11  3:13                     ` Zhuo, Qiuxu
  1 sibling, 1 reply; 15+ messages in thread
From: Bjorn Helgaas @ 2021-03-10 22:00 UTC (permalink / raw)
  To: Qiuxu Zhuo
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Sean V Kelley, Luck, Tony, Jin, Wen, linux-pci, linux-kernel

On Mon, Feb 22, 2021 at 09:17:17AM +0800, Qiuxu Zhuo wrote:
> Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
> byte encoding the device and function number) as the device number to
> check whether the corresponding bit was set in the RCiEPBitmap of the
> RCEC (Root Complex Event Collector) while enumerating over each bit of
> the RCiEPBitmap.
> 
> As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
> Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
> use a device number to check whether the corresponding bit was set in
> the RCiEPBitmap.
> 
> Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
> of "rciep->devfn" to a device number to ensure that the RCiEP devices
> associated with the RCEC are linked when the RCEC is enumerated.
> 
> Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
> Reported-and-tested-by: Wen Jin <wen.jin@intel.com>
> Reviewed-by: Sean V Kelley <sean.v.kelley@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>

I think 507b460f8144 appeared in v5.11, so not something we broke in
v5.12.  Applied to pci/error for v5.13, thanks!

If I understand correctly, we previously only got this right in one
case:

   0 == PCI_SLOT(00.0)    # correct
   1 == PCI_SLOT(00.1)    # incorrect
   2 == PCI_SLOT(00.2)    # incorrect
   ...
   8 == PCI_SLOT(01.0)    # incorrect
   9 == PCI_SLOT(01.1)    # incorrect
   ...
  31 == PCI_SLOT(03.7)    # incorrect

> ---
> v2->v3:
>  Drop "[ Krzysztof: Update commit message. ]" from the commit message
> 
>  drivers/pci/pcie/rcec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
> index 2c5c552994e4..d0bcd141ac9c 100644
> --- a/drivers/pci/pcie/rcec.c
> +++ b/drivers/pci/pcie/rcec.c
> @@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct pci_dev *rciep)
>  
>  	/* Same bus, so check bitmap */
>  	for_each_set_bit(devn, &bitmap, 32)
> -		if (devn == rciep->devfn)
> +		if (devn == PCI_SLOT(rciep->devfn))
>  			return true;
>  
>  	return false;
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
  2021-03-10 22:00                   ` Bjorn Helgaas
@ 2021-03-11  3:13                     ` Zhuo, Qiuxu
  0 siblings, 0 replies; 15+ messages in thread
From: Zhuo, Qiuxu @ 2021-03-11  3:13 UTC (permalink / raw)
  To: 'Bjorn Helgaas'
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Kelley, Sean V, Luck, Tony, Jin, Wen, linux-pci, linux-kernel

> [...]
> 
> I think 507b460f8144 appeared in v5.11, so not something we broke in v5.12.
> Applied to pci/error for v5.13, thanks!

Thanks Bjorn!

> If I understand correctly, we previously only got this right in one
> case:
> 
>    0 == PCI_SLOT(00.0)    # correct
>    1 == PCI_SLOT(00.1)    # incorrect
>    2 == PCI_SLOT(00.2)    # incorrect
>    ...
>    8 == PCI_SLOT(01.0)    # incorrect
>    9 == PCI_SLOT(01.1)    # incorrect
>    ...
>   31 == PCI_SLOT(03.7)    # incorrect

Yes, you're right. 

Thanks!
-Qiuxu


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-03-11  3:14 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-10  2:05 [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices Qiuxu Zhuo
2021-02-10  4:33 ` Kelley, Sean V
2021-02-10 17:12 ` Krzysztof Wilczyński
2021-02-18  3:00   ` Zhuo, Qiuxu
2021-02-18 22:07     ` 'Krzysztof Wilczyński'
2021-02-18 22:11       ` 'Krzysztof Wilczyński'
2021-02-19  1:52         ` Zhuo, Qiuxu
2021-02-19  2:23           ` [PATCH v2 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association Qiuxu Zhuo
2021-02-22  0:56             ` Krzysztof Wilczyński
2021-02-22  1:04               ` Zhuo, Qiuxu
2021-02-22  1:17                 ` [PATCH v3 " Qiuxu Zhuo
2021-03-05  6:12                   ` Zhuo, Qiuxu
2021-03-10 22:00                   ` Bjorn Helgaas
2021-03-11  3:13                     ` Zhuo, Qiuxu
2021-02-19  1:51       ` [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices Zhuo, Qiuxu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).