* PCI rescan issue
@ 2014-06-19  1:33 Jon Baker
  2014-06-19  4:25 ` Yijing Wang
  2014-06-19  7:14 ` Chen, Tiejun
  0 siblings, 2 replies; 5+ messages in thread
From: Jon Baker @ 2014-06-19  1:33 UTC (permalink / raw)
  To: linux-pci; +Cc: Gogineni, Naveen

I am trying to find a solution to a Linux PCI rescan problem.

I have CentOS 6.5

     [jbaker@server0 ~]$ uname -a
     Linux server0 2.6.32-431.5.1.el6.x86_64 #1 SMP Wed Feb 12 00:41:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I have an Altera FPGA PCIe protoboard I am trying to test in a TYAN 
FT77A-B7059.

We are trying to solve some hotplug/FPGA reload issues: when we reload 
the FPGA, its PCI interface reloads too, which confuses the Linux PCI code.

After bootup the board is seen by the kernel; lspci shows the Altera FPGA 
PCIe protoboard settings and config space.  We can load our driver and 
all is well.

When we reload the FPGA, lspci shows all "ff" for the config space.  
As root I tried

     echo 1 > /sys/bus/pci/rescan

But no change to lspci output.

I tried some variants

     echo 1 > /sys/bus/pci/devices/0000:8a:00.0/rescan

no change to lspci output.

     echo 1 > /sys/bus/pci/devices/0000:8a:00.0/remove
     echo 1 > /sys/bus/pci/rescan

This removed the device, and it did not show up again in lspci.

Kernel has PCI HOTPLUG enabled:

     [jbaker@server0 ~]$ cat /boot/config-2.6.32-431.el6.x86_64 | grep HOTPLUG
     CONFIG_HOTPLUG=y
     CONFIG_MEMORY_HOTPLUG=y
     CONFIG_MEMORY_HOTPLUG_SPARSE=y
     CONFIG_HOTPLUG_CPU=y
     CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
     CONFIG_ACPI_HOTPLUG_CPU=y
     CONFIG_ACPI_HOTPLUG_MEMORY=y
     CONFIG_ACPI_HOTPLUG_MEMORY_AUTO_ONLINE=y
     CONFIG_HOTPLUG_PCI_PCIE=y
     CONFIG_HOTPLUG_PCI=y
     CONFIG_HOTPLUG_PCI_FAKE=m
     CONFIG_HOTPLUG_PCI_ACPI=y
     CONFIG_HOTPLUG_PCI_ACPI_IBM=m
     # CONFIG_HOTPLUG_PCI_CPCI is not set
     CONFIG_HOTPLUG_PCI_SHPC=m
     [jbaker@server0 ~]$

The kernel has the pciehp module loaded; I see pciehp messages in 
/var/log/messages.

In other distros, on other platforms, echo 1 > /sys/bus/pci/rescan has 
worked but does not appear to work here for CentOS 6.5 on this hardware.

This problem is much discussed on the web, but I have yet to see a 
solution. I am thinking the 2.6.32-431.5.1.el6.x86_64 kernel 
may require one or more patches.  Perhaps the rescan is not resetting the PCIe card?

Any ideas how to get rescan to work?

Thank you,

Jon Baker

-- 
===================================
Jon Baker
Software Engineer
ViaSat Inc.
Cleveland OH
216-706-7800
jon.baker@viasat.com
www.viasat.com
===================================



* Re: PCI rescan issue
  2014-06-19  1:33 PCI rescan issue Jon Baker
@ 2014-06-19  4:25 ` Yijing Wang
       [not found]   ` <53A2E71D.9060208@viasat.com>
  2014-06-19  7:14 ` Chen, Tiejun
  1 sibling, 1 reply; 5+ messages in thread
From: Yijing Wang @ 2014-06-19  4:25 UTC (permalink / raw)
  To: jon.baker, linux-pci; +Cc: Gogineni, Naveen

On 2014/6/19 9:33, Jon Baker wrote:
> I am trying to find a solution to a Linux PCI rescan problem.
> 
> I have CentOS 6.5
> 
>     [jbaker@server0 ~]$ uname -a
>     Linux server0 2.6.32-431.5.1.el6.x86_64 #1 SMP Wed Feb 12 00:41:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Can you test it in the latest kernel?

> 
> I have an Altera FPGA PCIe protoboard I am trying to test in a TYAN FT77A-B7059.
> 
> We are trying to solve some hotplug/FPGA reload issues; when we reload the FPGA the PCI interface reloads confusing linux pci code.
> 
> After bootup the board is seen by the kernel; lspci shows Altera FPGA PCIe protoboard settings and config space.  We can load our driver and all is well.
> 
> When we reload the FPGA lspci display shows all "ff" for config space.  As root I tried

What happens while you reload the FPGA? Is the PCI device reset, or does anything else happen?

> 
>     echo 1 > /sys/bus/pci/rescan
> 
> But no change to lspci output.
> 
> I tried some variants
> 
>     echo 1 > /sys/bus/pci/devices/0000:8a:00.0/rescan
> 
> no change to lspci output.
> 
>     echo 1 > /sys/bus/pci/devices/0000:8a:00.0/remove
>     echo 1 > /sys/bus/pci/rescan

I guess the link to your board was disconnected. Maybe the power to the device is off, or there is a link error, so the OS
cannot access the PCI device's config space and the rescan finds nothing.

It would help if you provided more detailed info, such as lspci -vvvxxx and dmesg output, before and after your operations.



-- 
Thanks!
Yijing



* Re: PCI rescan issue
  2014-06-19  1:33 PCI rescan issue Jon Baker
  2014-06-19  4:25 ` Yijing Wang
@ 2014-06-19  7:14 ` Chen, Tiejun
       [not found]   ` <53A2E395.5010702@viasat.com>
  1 sibling, 1 reply; 5+ messages in thread
From: Chen, Tiejun @ 2014-06-19  7:14 UTC (permalink / raw)
  To: jon.baker, linux-pci; +Cc: Gogineni, Naveen

On 2014/6/19 9:33, Jon Baker wrote:
> I am trying to find a solution to a Linux PCI rescan problem.
>
> I have CentOS 6.5
>
>     [jbaker@server0 ~]$ uname -a
>     Linux server0 2.6.32-431.5.1.el6.x86_64 #1 SMP Wed Feb 12 00:41:43 
> UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>
> I have an Altera FPGA PCIe protoboard I am trying to test in a TYAN 
> FT77A-B7059.
>
> We are trying to solve some hotplug/FPGA reload issues; when we reload 
> the FPGA the PCI interface reloads confusing linux pci code.
>
> After bootup the board is seen by the kernel; lspci shows Altera FPGA 
> PCIe protoboard settings and config space.  We can load our driver and 
> all is well.
>
> When we reload the FPGA lspci display shows all "ff" for config space.  As

Are you saying all values in the config space are 0xff? If so, this 
means the vendor/device IDs are invalid, so I'm curious how the OS 
identifies this device. Can you really still see it with lspci?

Tiejun

> root I tried
>
>     echo 1 > /sys/bus/pci/rescan
>
> But no change to lspci output.
>
> I tried some variants
>
>     echo 1 > /sys/bus/pci/devices/0000:8a:00.0/rescan
>
> no change to lspci output.
>
>     echo 1 > /sys/bus/pci/devices/0000:8a:00.0/remove
>     echo 1 > /sys/bus/pci/rescan



* Re: PCI rescan issue
       [not found]   ` <53A2E71D.9060208@viasat.com>
@ 2014-06-20  1:23     ` Yijing Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Yijing Wang @ 2014-06-20  1:23 UTC (permalink / raw)
  To: jon.baker, linux-pci; +Cc: Gogineni, Naveen

On 2014/6/19 21:35, Jon Baker wrote:
> 
> On 06/19/2014 12:25 AM, Yijing Wang wrote:
>> On 2014/6/19 9:33, Jon Baker wrote:
>>> I am trying to find a solution to a Linux PCI rescan problem.
>>>
>>> I have CentOS 6.5
>>>
>>>     [jbaker@server0 ~]$ uname -a
>>>     Linux server0 2.6.32-431.5.1.el6.x86_64 #1 SMP Wed Feb 12 00:41:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>> Can you test it in the latest kernel?
> 
> Not easily, this is a system shared amongst several developers and containing some specific tools and applications. And unfortunately we know this works with a 3.X kernel on an embedded target used by another group at our company. - Jon
>>
>>> I have an Altera FPGA PCIe protoboard I am trying to test in a TYAN FT77A-B7059.
>>>
>>> We are trying to solve some hotplug/FPGA reload issues; when we reload the FPGA the PCI interface reloads confusing linux pci code.
>>>
>>> After bootup the board is seen by the kernel; lspci shows Altera FPGA PCIe protoboard settings and config space.  We can load our driver and all is well.
>>>
>>> When we reload the FPGA lspci display shows all "ff" for config space.  As root I tried
>> What happened during you reload the FPGA? reset PCI device or other anything?
> I do not know, we have not monitored this.  The PCI interface is contained within the FPGA so I don't really know for sure how that behaves on reload. - Jon

If lspci shows all ff for the PCI device after the FPGA reload, and you did nothing else during the reload, I think the PCI device
simply cannot be found after your FPGA reload. As I mentioned, maybe the PCI device's power or link has a problem. I don't think this is a Linux problem.

You can try this:
1. echo 1 > /sys/bus/pci/devices/0000:8a:00.0/remove
2. Reload the FPGA board.
3. echo 1 > /sys/bus/pci/rescan
4. Run lspci to check whether your PCI device shows up.

Removing the device before the FPGA board reload avoids leaving stale device info in the OS.
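
The sequence above could be wrapped in a small shell function, roughly as below. This is only a sketch under assumptions: the SYSFS_PCI variable and the reload command passed as the second argument are hypothetical names introduced here, and how the FPGA is actually reprogrammed is site-specific and not described in this thread.

```shell
#!/bin/sh
# Sketch only: remove the device, reload the FPGA, then rescan.
# SYSFS_PCI is an illustrative override; the default is the real sysfs path.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci}"

remove_reload_rescan() {
    bdf="$1"          # full device address, e.g. 0000:8a:00.0
    reload_cmd="$2"   # hypothetical command that reprograms the FPGA

    # 1. Detach the device first so the kernel keeps no stale state for it.
    if [ -e "$SYSFS_PCI/devices/$bdf/remove" ]; then
        echo 1 > "$SYSFS_PCI/devices/$bdf/remove" || return 1
    fi

    # 2. Reload the FPGA (site-specific; assumed to block until done).
    $reload_cmd || return 1

    # 3. Ask the kernel to re-enumerate the PCI buses.
    echo 1 > "$SYSFS_PCI/rescan" || return 1
}
```

Run as root, a call might look like `remove_reload_rescan 0000:8a:00.0 ./reload_fpga.sh` (the script name is made up), followed by lspci to check whether the device came back.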

or

You can look in /sys/bus/pci/slots/ to find out whether your PCI device's slot supports hotplug.
If it does, you can run
echo 0 > /sys/bus/pci/slots/$device/power
echo 1 > /sys/bus/pci/slots/$device/power
to reset the PCI device, then use lspci to see whether the device is found.
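
A rough sketch of that slot lookup and power cycle follows. It assumes each slot directory exposes an `address` file (domain:bus:device, without the function) and that a `power` file is present only when a hotplug driver has claimed the slot. SYSFS_SLOTS and SETTLE_SECS are illustrative names, and the one-second settle delay is a guess, not a documented requirement.

```shell
#!/bin/sh
# Sketch only: find the slot for a device address and power-cycle it.
SYSFS_SLOTS="${SYSFS_SLOTS:-/sys/bus/pci/slots}"

# Print the name of the slot whose address matches, e.g. 0000:13:00.
find_slot() {
    want="$1"
    for d in "$SYSFS_SLOTS"/*; do
        [ -r "$d/address" ] || continue
        if [ "$(cat "$d/address")" = "$want" ]; then
            basename "$d"
            return 0
        fi
    done
    return 1
}

# Power the slot off and back on through its sysfs power file.
power_cycle_slot() {
    slot="$1"
    p="$SYSFS_SLOTS/$slot/power"
    if [ ! -w "$p" ]; then
        echo "slot $slot has no writable power file (no hotplug support?)" >&2
        return 1
    fi
    echo 0 > "$p" || return 1
    sleep "${SETTLE_SECS:-1}"   # assumed settle time before powering back on
    echo 1 > "$p"
}
```

For example, as root: `slot=$(find_slot 0000:13:00) && power_cycle_slot "$slot"`. If find_slot prints nothing, the device is probably not in a slot the kernel knows about, and this approach does not apply.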

>>
>>>     echo 1 > /sys/bus/pci/rescan
>>>
>>> But no change to lspci output.
>>>
>>> I tried some variants
>>>
>>>     echo 1 > /sys/bus/pci/devices/0000:8a:00.0/rescan
>>>
>>> no change to lspci output.
>>>
>>>     echo 1 > /sys/bus/pci/devices/0000:8a:00.0/remove
>>>     echo 1 > /sys/bus/pci/rescan
>> I guess the device link to your board was disconnect. Maybe the power to device is off or link error. So OS
>> can not access pci device config space and return nothing found.
>>
>> If you provide more detailed info is better. like lspci -vvvxxx, dmesg etc. before and after your operations.
> 
> I have logged this...
> 
> After system power up (case where everything works expected):
> 
>     $ lspci -vvv -xx -s 13:00.0
>     13:00.0 Non-VGA unclassified device: Device 1b83:f002 (rev 01)
>             Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>             Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>             Latency: 0, Cache Line Size: 64 bytes
>             Interrupt: pin A routed to IRQ 11
>             Region 0: Memory at dcc00000 (32-bit, non-prefetchable) [size=256K]
>             Region 1: Memory at dcb00000 (32-bit, non-prefetchable) [size=1M]
>             Capabilities: <access denied>
>     00: 83 1b 02 f0 06 00 10 00 01 00 00 00 10 00 00 00
>     10: 00 00 c0 dc 00 00 b0 dc 00 00 00 00 00 00 00 00
>     20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>     30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 00 00

Can you show the lspci info for the upstream port device? If the 13:00.0 device supports hotplug, the upstream device has a register
to control the slot power.
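
One way to locate the upstream port is to resolve the device's sysfs symlink and take the parent directory name. A minimal sketch, assuming the usual sysfs layout in which a device directory is nested under its parent bridge; the function name is made up here, and `readlink -f` is assumed to be available (it is in GNU coreutils and busybox).

```shell
#!/bin/sh
# Sketch only: print the address of the bridge directly above a device.
upstream_port() {
    dev_link="$1"   # e.g. /sys/bus/pci/devices/0000:13:00.0
    # Resolve the symlink into the physical device tree, then step up one level.
    parent=$(dirname "$(readlink -f "$dev_link")")
    basename "$parent"
}
```

The result can be passed to `lspci -vvv -s` (as root, so the capabilities are readable). If the function prints something like pci0000:00 instead of a device address, the device sits directly on a root bus and has no upstream bridge.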


> 
> 
> After FPGA reload:
> 
>     $ lspci -vvv -xx -s 13:00.0
>     13:00.0 Non-VGA unclassified device: Device 1b83:f002 (rev ff) (prog-if ff)
>             !!! Unknown header type 7f
>     00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>     10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>     20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>     30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 
> 
> - Jon
> 


-- 
Thanks!
Yijing



* Re: PCI rescan issue
       [not found]   ` <53A2E395.5010702@viasat.com>
@ 2014-06-20  2:25     ` Chen, Tiejun
  0 siblings, 0 replies; 5+ messages in thread
From: Chen, Tiejun @ 2014-06-20  2:25 UTC (permalink / raw)
  To: jon.baker, linux-pci; +Cc: Gogineni, Naveen


On 2014/6/19 21:20, Jon Baker wrote:
>
> On 06/19/2014 03:14 AM, Chen, Tiejun wrote:
>> On 2014/6/19 9:33, Jon Baker wrote:
>>> I am trying to find a solution to a Linux PCI rescan problem.
>>>
>>> I have CentOS 6.5
>>>
>>>     [jbaker@server0 ~]$ uname -a
>>>     Linux server0 2.6.32-431.5.1.el6.x86_64 #1 SMP Wed Feb 12 
>>> 00:41:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> I have an Altera FPGA PCIe protoboard I am trying to test in a TYAN 
>>> FT77A-B7059.
>>>
>>> We are trying to solve some hotplug/FPGA reload issues; when we 
>>> reload the FPGA the PCI interface reloads confusing linux pci code.
>>>
>>> After bootup the board is seen by the kernel; lspci shows Altera 
>>> FPGA PCIe protoboard settings and config space.  We can load our 
>>> driver and all is well.
>>>
>>> When we reload the FPGA lspci display shows all "ff" for config 
>>> space.  As
>>
>> Are you saying all values in the config space are 0xff? If yes, this 
>> mean the vendor/device ids are invalid, so I'm just curious how OS 
>> identify this device, and you really can see that with lspci?
>>
>> Tiejun
> Yes.
>
> After system power up (case where everything works expected):
>
>     $ lspci -vvv -xx -s 13:00.0
>     13:00.0 Non-VGA unclassified device: Device 1b83:f002 (rev 01)
>             Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>             Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>             Latency: 0, Cache Line Size: 64 bytes
>             Interrupt: pin A routed to IRQ 11
>             Region 0: Memory at dcc00000 (32-bit, non-prefetchable) [size=256K]
>             Region 1: Memory at dcb00000 (32-bit, non-prefetchable) [size=1M]
>             Capabilities: <access denied>
>     00: 83 1b 02 f0 06 00 10 00 01 00 00 00 10 00 00 00
>     10: 00 00 c0 dc 00 00 b0 dc 00 00 00 00 00 00 00 00
>     20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>     30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 00 00
>
>
> After FPGA reload:
>
>     $ lspci -vvv -xx -s 13:00.0
>     13:00.0 Non-VGA unclassified device: Device 1b83:f002 (rev ff) (prog-if ff)
>             !!! Unknown header type 7f
>     00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>     10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>     20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>     30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>

It looks like you rescanned directly, without removing the device first. 
But it also seems you can't find your device again when you do remove it 
first. So first of all you need to check whether the host and the device 
support hotplug before you validate the hotplug feature.

If you rescan manually, you may need to check whether the link is 
established before the rescan. If it is, the FPGA probably isn't actually 
responding to config reads from the bus, so all values are 0xff. You can 
use a PCIe analyzer to check this.
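
Before rescanning, a quick software-side check of whether the device answers config reads at all could look like the sketch below. It reads the first four bytes (vendor and device ID) of a config-space file and tests whether they are all ones, which is what a read from a device that does not respond returns. The function name is made up here, and this does not replace checking the link state itself.

```shell
#!/bin/sh
# Sketch only: succeed if the vendor/device ID bytes read back as all ones.
config_is_all_ff() {
    cfg="$1"   # e.g. /sys/bus/pci/devices/0000:13:00.0/config
    # Dump the first 4 bytes as hex and strip od's spacing.
    id=$(od -An -tx1 -N4 "$cfg" | tr -d ' \n')
    [ "$id" = "ffffffff" ]
}
```

For example: `config_is_all_ff /sys/bus/pci/devices/0000:13:00.0/config && echo "device not responding"`. While this reports true, a rescan alone is unlikely to bring the device back.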

Tiejun
