linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] pciehp: Wait for link get trained in pci_check_link_status()
@ 2011-11-08  1:13 Yinghai Lu
  2011-11-08  9:19 ` Kenji Kaneshige
  0 siblings, 1 reply; 12+ messages in thread
From: Yinghai Lu @ 2011-11-08  1:13 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Kenji Kaneshige, linux-kernel, linux-pci


Found one PCI Express Modules has link training error after hotplug.
It turns out that after DLLLA is set, LT is still set for a while.
So pciehp will delcare that hotplug fail in 1s.

HW guys say that pciehp is against PCI-e SPEC:
 From PCI Express Base Specification Revision 2.1, Section 6.7.3.3:
 Software must allow 1 second after the Data Link Layer Link Active bit reads 1b
 before it is permitted to determine that a hot plugged device which fails to
 return a Successful Completion for a Valid Configuration Request is a broken
 device (see section 6.6).

Try to wait for long enough by adding LT checking in 1s.

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>

---
 drivers/pci/hotplug/pciehp_hpc.c |   52 +++++++++++++++++++++++++++------------
 1 file changed, 37 insertions(+), 15 deletions(-)

Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
@@ -241,28 +241,49 @@ static int pcie_write_cmd(struct control
 	return retval;
 }
 
-static inline int check_link_active(struct controller *ctrl)
+static inline bool
+check_link_status_bits(struct controller *ctrl, u16 mask, u16 val)
 {
 	u16 link_status;
 
 	if (pciehp_readw(ctrl, PCI_EXP_LNKSTA, &link_status))
-		return 0;
-	return !!(link_status & PCI_EXP_LNKSTA_DLLLA);
+		return false;
+
+	if ((link_status & mask) == val)
+		return true;
+
+	return false;
 }
 
-static void pcie_wait_link_active(struct controller *ctrl)
+static bool
+pcie_wait_link_status_bits(struct controller *ctrl, u16 mask, u16 val)
 {
 	int timeout = 1000;
 
-	if (check_link_active(ctrl))
-		return;
+	if (check_link_status_bits(ctrl, mask, val))
+		return true;
 	while (timeout > 0) {
-		msleep(10);
-		timeout -= 10;
-		if (check_link_active(ctrl))
-			return;
+		msleep(20);
+		timeout -= 20;
+		if (check_link_status_bits(ctrl, mask, val))
+			return true;
 	}
-	ctrl_dbg(ctrl, "Data Link Layer Link Active not set in 1000 msec\n");
+
+	return false;
+}
+
+static void pcie_wait_link_active(struct controller *ctrl)
+{
+	if (!pcie_wait_link_status_bits(ctrl, PCI_EXP_LNKSTA_DLLLA,
+					 PCI_EXP_LNKSTA_DLLLA))
+		ctrl_dbg(ctrl,
+			 "Data Link Layer Link Active not set in 1000 msec\n");
+}
+
+static void pcie_wait_link_training_done(struct controller *ctrl)
+{
+	if (!pcie_wait_link_status_bits(ctrl, PCI_EXP_LNKSTA_LT, 0))
+		ctrl_dbg(ctrl, "Link Training is not done in 1000 msec\n");
 }
 
 int pciehp_check_link_status(struct controller *ctrl)
@@ -275,10 +296,11 @@ int pciehp_check_link_status(struct cont
          * hot-plug capable downstream port. But old controller might
          * not implement it. In this case, we wait for 1000 ms.
          */
-        if (ctrl->link_active_reporting)
-                pcie_wait_link_active(ctrl);
-        else
-                msleep(1000);
+	if (ctrl->link_active_reporting) {
+		pcie_wait_link_active(ctrl);
+		pcie_wait_link_training_done(ctrl);
+	} else
+		msleep(1000);
 
 	retval = pciehp_readw(ctrl, PCI_EXP_LNKSTA, &lnk_status);
 	if (retval) {

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] pciehp: Wait for link get trained in pci_check_link_status()
  2011-11-08  1:13 [RFC PATCH] pciehp: Wait for link get trained in pci_check_link_status() Yinghai Lu
@ 2011-11-08  9:19 ` Kenji Kaneshige
  2011-11-08 15:27   ` Yinghai Lu
  0 siblings, 1 reply; 12+ messages in thread
From: Kenji Kaneshige @ 2011-11-08  9:19 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Jesse Barnes, linux-kernel, linux-pci

(2011/11/08 10:13), Yinghai Lu wrote:
> 
> Found one PCI Express Modules has link training error after hotplug.
> It turns out that after DLLLA is set, LT is still set for a while.
> So pciehp will delcare that hotplug fail in 1s.

I think DLLLA bit reads 1b means LT is completed. So I don't know why LT
is still set on your platform.

> 
> HW guys say that pciehp is against PCI-e SPEC:
>   From PCI Express Base Specification Revision 2.1, Section 6.7.3.3:
>   Software must allow 1 second after the Data Link Layer Link Active bit reads 1b
>   before it is permitted to determine that a hot plugged device which fails to
>   return a Successful Completion for a Valid Configuration Request is a broken
>   device (see section 6.6).
> 
> Try to wait for long enough by adding LT checking in 1s.

The pciehp driver already have this 1 second wait in board_added().
So I still don't understand what in pciehp is against PCIe spec clearly.
Can you explain more about this?

What about the patch below? I think it's much simpler and has less impact.

Regards,
Kenji Kaneshige

---
 drivers/pci/hotplug/pciehp_ctrl.c |    3 ---
 drivers/pci/hotplug/pciehp_hpc.c  |   14 ++++++++++++++
 2 files changed, 14 insertions(+), 3 deletions(-)

Index: linux-3.1/drivers/pci/hotplug/pciehp_ctrl.c
===================================================================
--- linux-3.1.orig/drivers/pci/hotplug/pciehp_ctrl.c
+++ linux-3.1/drivers/pci/hotplug/pciehp_ctrl.c
@@ -213,9 +213,6 @@ static int board_added(struct slot *p_sl
 		goto err_exit;
 	}
 
-	/* Wait for 1 second after checking link training status */
-	msleep(1000);
-
 	/* Check for a power fault */
 	if (ctrl->power_fault_detected || pciehp_query_power_fault(p_slot)) {
 		ctrl_err(ctrl, "Power fault on slot %s\n", slot_name(p_slot));
Index: linux-3.1/drivers/pci/hotplug/pciehp_hpc.c
===================================================================
--- linux-3.1.orig/drivers/pci/hotplug/pciehp_hpc.c
+++ linux-3.1/drivers/pci/hotplug/pciehp_hpc.c
@@ -280,6 +280,20 @@ int pciehp_check_link_status(struct cont
         else
                 msleep(1000);
 
+	/*
+	 * Need to wait for 1 second after the Data Link Layer Link
+	 * Active bit reads 1b before sending configuration request.
+	 * We also need additional 100 ms wait if the downstream port
+	 * supports Link speeds greater than 5.0 GT/s. We place this
+	 * wait before checking Link Training bit because Link
+	 * Training bit still is set even after Data Link Layer Link
+	 * Active bit is set on some platforms.
+	 */
+	if (ctrl->pcie->port->subordinate->max_bus_speed > PCIE_SPEED_5_0GT)
+		msleep(1100);
+	else
+		msleep(1000);
+
 	retval = pciehp_readw(ctrl, PCI_EXP_LNKSTA, &lnk_status);
 	if (retval) {
 		ctrl_err(ctrl, "Cannot read LNKSTATUS register\n");

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] pciehp: Wait for link get trained in pci_check_link_status()
  2011-11-08  9:19 ` Kenji Kaneshige
@ 2011-11-08 15:27   ` Yinghai Lu
  2011-11-08 21:54     ` Yinghai Lu
  0 siblings, 1 reply; 12+ messages in thread
From: Yinghai Lu @ 2011-11-08 15:27 UTC (permalink / raw)
  To: Kenji Kaneshige; +Cc: Jesse Barnes, linux-kernel, linux-pci

On 11/08/2011 01:19 AM, Kenji Kaneshige wrote:

> (2011/11/08 10:13), Yinghai Lu wrote:
>>
>> Found one PCI Express Modules has link training error after hotplug.
>> It turns out that after DLLLA is set, LT is still set for a while.
>> So pciehp will delcare that hotplug fail in 1s.
> 
> I think DLLLA bit reads 1b means LT is completed. So I don't know why LT
> is still set on your platform.


recovery training ?

> 
>>
>> HW guys say that pciehp is against PCI-e SPEC:
>>   From PCI Express Base Specification Revision 2.1, Section 6.7.3.3:
>>   Software must allow 1 second after the Data Link Layer Link Active bit reads 1b
>>   before it is permitted to determine that a hot plugged device which fails to
>>   return a Successful Completion for a Valid Configuration Request is a broken
>>   device (see section 6.6).
>>
>> Try to wait for long enough by adding LT checking in 1s.
> 
> The pciehp driver already have this 1 second wait in board_added().
> So I still don't understand what in pciehp is against PCIe spec clearly.
> Can you explain more about this?


should be fail in 1s after DLLLA, is set.

even we should not check LT, and should try pci conf reading to new added device.

> 
> What about the patch below? I think it's much simpler and has less impact.


Should work. will try it today.

Thanks

Yinghai Lu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] pciehp: Wait for link get trained in pci_check_link_status()
  2011-11-08 15:27   ` Yinghai Lu
@ 2011-11-08 21:54     ` Yinghai Lu
  2011-11-10  7:39       ` Kenji Kaneshige
  0 siblings, 1 reply; 12+ messages in thread
From: Yinghai Lu @ 2011-11-08 21:54 UTC (permalink / raw)
  To: Kenji Kaneshige; +Cc: Jesse Barnes, linux-kernel, linux-pci

On 11/08/2011 07:27 AM, Yinghai Lu wrote:

> On 11/08/2011 01:19 AM, Kenji Kaneshige wrote:
>>
>> What about the patch below? I think it's much simpler and has less impact.
> 
> Should work. will try it today.


yes, that works too.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] pciehp: Wait for link get trained in pci_check_link_status()
  2011-11-08 21:54     ` Yinghai Lu
@ 2011-11-10  7:39       ` Kenji Kaneshige
  2011-11-10  7:40         ` [PATCH 1/2] pciehp: wait 1000 ms before Link Training check Kenji Kaneshige
  2011-11-10  7:42         ` [PATCH 2/2] pciehp: wait 100 ms after " Kenji Kaneshige
  0 siblings, 2 replies; 12+ messages in thread
From: Kenji Kaneshige @ 2011-11-10  7:39 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Jesse Barnes, linux-kernel, linux-pci

(2011/11/09 6:54), Yinghai Lu wrote:
> On 11/08/2011 07:27 AM, Yinghai Lu wrote:
>
>> On 11/08/2011 01:19 AM, Kenji Kaneshige wrote:
>>>
>>> What about the patch below? I think it's much simpler and has less impact.
>>
>> Should work. will try it today.
>
>
> yes, that works too.
>
> Thanks
>
> Yinghai
>

I updated the patch with description. I split it into two and
a logic is a little changed. Could you check these?

Regards,
Kenji Kaneshige


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] pciehp: wait 1000 ms before Link Training check
  2011-11-10  7:39       ` Kenji Kaneshige
@ 2011-11-10  7:40         ` Kenji Kaneshige
  2011-11-10 19:10           ` Yinghai Lu
  2011-11-10  7:42         ` [PATCH 2/2] pciehp: wait 100 ms after " Kenji Kaneshige
  1 sibling, 1 reply; 12+ messages in thread
From: Kenji Kaneshige @ 2011-11-10  7:40 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Jesse Barnes, linux-kernel, linux-pci

We need to wait for 1000 ms after Data Link Layer Link Active (DLLLA)
bit reads 1b before sending configuration request. Currently pciehp
does this wait after checking Link Training (LT) bit. But we need it
before checking LT bit because LT is still set even after DLLLA bit is
set on some platforms.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
---
 drivers/pci/hotplug/pciehp_ctrl.c |    3 ---
 drivers/pci/hotplug/pciehp_hpc.c  |    8 ++++++++
 2 files changed, 8 insertions(+), 3 deletions(-)

Index: linux-3.1/drivers/pci/hotplug/pciehp_ctrl.c
===================================================================
--- linux-3.1.orig/drivers/pci/hotplug/pciehp_ctrl.c
+++ linux-3.1/drivers/pci/hotplug/pciehp_ctrl.c
@@ -213,9 +213,6 @@ static int board_added(struct slot *p_sl
 		goto err_exit;
 	}
 
-	/* Wait for 1 second after checking link training status */
-	msleep(1000);
-
 	/* Check for a power fault */
 	if (ctrl->power_fault_detected || pciehp_query_power_fault(p_slot)) {
 		ctrl_err(ctrl, "Power fault on slot %s\n", slot_name(p_slot));
Index: linux-3.1/drivers/pci/hotplug/pciehp_hpc.c
===================================================================
--- linux-3.1.orig/drivers/pci/hotplug/pciehp_hpc.c
+++ linux-3.1/drivers/pci/hotplug/pciehp_hpc.c
@@ -280,6 +280,14 @@ int pciehp_check_link_status(struct cont
         else
                 msleep(1000);
 
+	/*
+	 * Need to wait for 1000 ms after Data Link Layer Link Active
+	 * (DLLLA) bit reads 1b before sending configuration request.
+	 * We need it before checking Link Training (LT) bit becuase
+	 * LT is still set even after DLLLA bit is set on some platform.
+	 */
+	msleep(1000);
+
 	retval = pciehp_readw(ctrl, PCI_EXP_LNKSTA, &lnk_status);
 	if (retval) {
 		ctrl_err(ctrl, "Cannot read LNKSTATUS register\n");

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/2] pciehp: wait 100 ms after Link Training check
  2011-11-10  7:39       ` Kenji Kaneshige
  2011-11-10  7:40         ` [PATCH 1/2] pciehp: wait 1000 ms before Link Training check Kenji Kaneshige
@ 2011-11-10  7:42         ` Kenji Kaneshige
  2011-11-10 19:10           ` Yinghai Lu
  1 sibling, 1 reply; 12+ messages in thread
From: Kenji Kaneshige @ 2011-11-10  7:42 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Jesse Barnes, linux-kernel, linux-pci

If the port supports Link speeds greater than 5.0 GT/s, we must wait
for 100 ms after Link training completes before sending configuration
request.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
---
 drivers/pci/hotplug/pciehp_hpc.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Index: linux-3.1/drivers/pci/hotplug/pciehp_hpc.c
===================================================================
--- linux-3.1.orig/drivers/pci/hotplug/pciehp_hpc.c
+++ linux-3.1/drivers/pci/hotplug/pciehp_hpc.c
@@ -302,6 +302,14 @@ int pciehp_check_link_status(struct cont
 		return retval;
 	}
 
+	/*
+	 * If the port supports Link speeds greater than 5.0 GT/s, we
+	 * must wait for 100 ms after Link training completes before
+	 * sending configuration request.
+	 */
+	if (ctrl->pcie->port->subordinate->max_bus_speed > PCIE_SPEED_5_0GT)
+		msleep(100);
+
 	pcie_update_link_speed(ctrl->pcie->port->subordinate, lnk_status);
 
 	return retval;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] pciehp: wait 1000 ms before Link Training check
  2011-11-10  7:40         ` [PATCH 1/2] pciehp: wait 1000 ms before Link Training check Kenji Kaneshige
@ 2011-11-10 19:10           ` Yinghai Lu
  0 siblings, 0 replies; 12+ messages in thread
From: Yinghai Lu @ 2011-11-10 19:10 UTC (permalink / raw)
  To: Kenji Kaneshige; +Cc: Jesse Barnes, linux-kernel, linux-pci

On 11/09/2011 11:40 PM, Kenji Kaneshige wrote:

> We need to wait for 1000 ms after Data Link Layer Link Active (DLLLA)
> bit reads 1b before sending configuration request. Currently pciehp
> does this wait after checking Link Training (LT) bit. But we need it
> before checking LT bit because LT is still set even after DLLLA bit is
> set on some platforms.
> 
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>


Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] pciehp: wait 100 ms after Link Training check
  2011-11-10  7:42         ` [PATCH 2/2] pciehp: wait 100 ms after " Kenji Kaneshige
@ 2011-11-10 19:10           ` Yinghai Lu
  2011-11-11 17:32             ` Jesse Barnes
  0 siblings, 1 reply; 12+ messages in thread
From: Yinghai Lu @ 2011-11-10 19:10 UTC (permalink / raw)
  To: Kenji Kaneshige; +Cc: Jesse Barnes, linux-kernel, linux-pci

On 11/09/2011 11:42 PM, Kenji Kaneshige wrote:

> If the port supports Link speeds greater than 5.0 GT/s, we must wait
> for 100 ms after Link training completes before sending configuration
> request.
> 
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>



Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] pciehp: wait 100 ms after Link Training check
  2011-11-10 19:10           ` Yinghai Lu
@ 2011-11-11 17:32             ` Jesse Barnes
  2011-11-11 17:33               ` Yinghai Lu
  0 siblings, 1 reply; 12+ messages in thread
From: Jesse Barnes @ 2011-11-11 17:32 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Kenji Kaneshige, linux-kernel, linux-pci

[-- Attachment #1: Type: text/plain, Size: 592 bytes --]

On Thu, 10 Nov 2011 11:10:56 -0800
Yinghai Lu <yinghai.lu@oracle.com> wrote:

> On 11/09/2011 11:42 PM, Kenji Kaneshige wrote:
> 
> > If the port supports Link speeds greater than 5.0 GT/s, we must wait
> > for 100 ms after Link training completes before sending configuration
> > request.
> > 
> > Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> 
> 
> 
> Acked-by: Yinghai Lu <yinghai@kernel.org>
> Tested-by: Yinghai Lu <yinghai@kernel.org>

Applied these two to my for-linus branch, thanks guys.

-- 
Jesse Barnes, Intel Open Source Technology Center

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] pciehp: wait 100 ms after Link Training check
  2011-11-11 17:32             ` Jesse Barnes
@ 2011-11-11 17:33               ` Yinghai Lu
  2011-11-11 17:40                 ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: Yinghai Lu @ 2011-11-11 17:33 UTC (permalink / raw)
  To: Jesse Barnes, Greg KH; +Cc: Kenji Kaneshige, linux-kernel, linux-pci

On 11/11/2011 09:32 AM, Jesse Barnes wrote:

> On Thu, 10 Nov 2011 11:10:56 -0800
> Yinghai Lu <yinghai.lu@oracle.com> wrote:
> 
>> On 11/09/2011 11:42 PM, Kenji Kaneshige wrote:
>>
>>> If the port supports Link speeds greater than 5.0 GT/s, we must wait
>>> for 100 ms after Link training completes before sending configuration
>>> request.
>>>
>>> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>>
>>
>>
>> Acked-by: Yinghai Lu <yinghai@kernel.org>
>> Tested-by: Yinghai Lu <yinghai@kernel.org>
> 
> Applied these two to my for-linus branch, thanks guys.
> 


Thanks, We may need to put them into stable too.

Yinghai

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] pciehp: wait 100 ms after Link Training check
  2011-11-11 17:33               ` Yinghai Lu
@ 2011-11-11 17:40                 ` Greg KH
  0 siblings, 0 replies; 12+ messages in thread
From: Greg KH @ 2011-11-11 17:40 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Jesse Barnes, Kenji Kaneshige, linux-kernel, linux-pci

On Fri, Nov 11, 2011 at 09:33:32AM -0800, Yinghai Lu wrote:
> On 11/11/2011 09:32 AM, Jesse Barnes wrote:
> 
> > On Thu, 10 Nov 2011 11:10:56 -0800
> > Yinghai Lu <yinghai.lu@oracle.com> wrote:
> > 
> >> On 11/09/2011 11:42 PM, Kenji Kaneshige wrote:
> >>
> >>> If the port supports Link speeds greater than 5.0 GT/s, we must wait
> >>> for 100 ms after Link training completes before sending configuration
> >>> request.
> >>>
> >>> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> >>
> >>
> >>
> >> Acked-by: Yinghai Lu <yinghai@kernel.org>
> >> Tested-by: Yinghai Lu <yinghai@kernel.org>
> > 
> > Applied these two to my for-linus branch, thanks guys.
> > 
> 
> 
> Thanks, We may need to put them into stable too.

Then send stable@vger.kernel.org the git commit ids once they show up in
Linus's tree.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-11-11 17:41 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-08  1:13 [RFC PATCH] pciehp: Wait for link get trained in pci_check_link_status() Yinghai Lu
2011-11-08  9:19 ` Kenji Kaneshige
2011-11-08 15:27   ` Yinghai Lu
2011-11-08 21:54     ` Yinghai Lu
2011-11-10  7:39       ` Kenji Kaneshige
2011-11-10  7:40         ` [PATCH 1/2] pciehp: wait 1000 ms before Link Training check Kenji Kaneshige
2011-11-10 19:10           ` Yinghai Lu
2011-11-10  7:42         ` [PATCH 2/2] pciehp: wait 100 ms after " Kenji Kaneshige
2011-11-10 19:10           ` Yinghai Lu
2011-11-11 17:32             ` Jesse Barnes
2011-11-11 17:33               ` Yinghai Lu
2011-11-11 17:40                 ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).