So to clarify, you have your system booted with no NVMe endpoint connected, and then when you run the SPDK setup.sh script, you see all of these kernel messages from trying to bind vfio to PCIe devices, and the system eventually crashes?

If so, we need to determine which PCIe devices setup.sh is trying to bind to vfio. It should only be trying to bind NVMe devices, but if there is no NVMe device connected then it shouldn't be trying to bind anything. Can you send lspci -vvvx output from your system before running setup.sh?

Thanks,

-Jim

From: SPDK on behalf of Oza Oza
Reply-To: Storage Performance Development Kit
Date: Tuesday, August 29, 2017 at 9:45 AM
To: Storage Performance Development Kit
Subject: Re: [SPDK] PCI hotplug and SPDK

In my opinion, this has nothing to do with the platform, though our platform is ARMv8. (But I cannot test on any other, because we don't know how the kernel driver is written.)

If the kernel driver supports hotplug, that means it allows pci_create_root_bus irrespective of whether the EP is plugged in or not. In other words, the following APIs are never called:

pci_stop_root_bus(bus);
pci_remove_root_bus(bus);

In that case, if the PCIe slot is empty, running SPDK results in stalls (10-15 seconds) followed by a crash.

Regards,
Oza.

From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Harris, James R
Sent: Tuesday, August 29, 2017 6:20 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] PCI hotplug and SPDK

Hi Oza,

Do you see this issue only on your armv8 platform or do you also see it on amd64?

-Jim

From: SPDK on behalf of Oza Oza
Reply-To: Storage Performance Development Kit
Date: Tuesday, August 29, 2017 at 1:51 AM
To: Storage Performance Development Kit
Subject: Re: [SPDK] PCI hotplug and SPDK

Sorry if I was unclear. I am not talking about the hotplug feature of SPDK.

The PCI hotplug feature is implemented in the kernel driver and works fine.
But the moment I run SPDK and try to bind the vfio driver, it stalls completely.
The reason is: the kernel driver does not remove the root bus (when the PCIe endpoint is not connected at boot time), so SPDK tries to bind the driver thinking the host bridge is there. Without PCI hotplug, the host bridge will not be there because of the following API calls in the kernel driver:

pci_stop_root_bus(bus);
pci_remove_root_bus(bus);

Since we now allow host bridge creation (API: pci_create_root_bus) irrespective of whether the EP is connected or not, if I run SPDK with no endpoint connected (empty slot), I get stalls.

Regards,
Oza.

From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Chang, Cunyin
Sent: Tuesday, August 29, 2017 2:14 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] PCI hotplug and SPDK

Hi Oza,

Could you please provide detailed steps to reproduce the issue?
SPDK takes charge of hotplug only after you bind the device to the uio or vfio driver, so a newly inserted device will be handled by the kernel driver first.

-Cunyin

From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Oza Oza
Sent: Tuesday, August 29, 2017 4:22 PM
To: Storage Performance Development Kit
Subject: [SPDK] PCI hotplug and SPDK

Hi All,

PCI hotplug support requires creation of the root bus and the probe to go ahead with all PCIe configuration, which means the following APIs are not called:

pci_stop_root_bus(bus);
pci_remove_root_bus(bus);

If I then run SPDK, it crashes the system with the following info.
Note: if the disk is connected then SPDK is fine. Otherwise it stalls the system with the following crash.
root@bcm958742k:~# echo 2048 > /proc/sys/vm/nr_hugepages; /usr/share/spdk/scripts/setup.sh
grep: /usr/share/spdk/scripts/../include/spdk/pci_ids.h: No such file or directory
[ 34.621325] pci 0008:00:00.0: PCI bridge to [bus 01]
[ 34.640586] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 50.267056] pci 0000:00:00.0: PCI bridge to [bus 01]
[ 50.272337] pci 0001:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 65.898762] pci 0001:00:00.0: PCI bridge to [bus 01]
[ 65.904015] pci 0006:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 81.530437] pci 0006:00:00.0: PCI bridge to [bus 01]
[ 81.535680] pci 0007:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 97.162103] pci 0007:00:00.0: PCI bridge to [bus 01]
[ 97.167255] Bad mode in Error handler detected on CPU6, code 0xbf000002 -- SError
[ 97.174974] Internal error: Oops - bad mode: 0 [#1] SMP
[ 97.180364] Modules linked in:
[ 97.183515] CPU: 6 PID: 2104 Comm: bash Not tainted 4.12.0-01560-gc83093d-dirty #89
[ 97.191413] Hardware name: Stingray Combo SVK w/PCIe IOMMU (BCM958742K) (DT)
[ 97.198683] task: ffff80a163a40000 task.stack: ffff80a1612b4000
[ 97.204790] PC is at 0xffff7cbdfba8
[ 97.208387] LR is at 0xffff7cb8f288
[ 97.211983] pc : [<0000ffff7cbdfba8>] lr : [<0000ffff7cb8f288>] pstate: 20000000
[ 97.219612] sp : 0000fffffe564040
[ 97.223029] x29: 0000fffffe564040 x28: 000000001054ce60
[ 97.228509] x27: 0000000000000000 x26: 00000000004e2000
[ 97.233989] x25: 00000000004e5000 x24: 0000000000000002
[ 97.239468] x23: 0000ffff7cc63638 x22: 0000000000000002
[ 97.244947] x21: 0000ffff7cc67480 x20: 000000001054db10
[ 97.250427] x19: 0000000000000002 x18: 0000000000000000
[ 97.255906] x17: 00000000004daac8 x16: 0000000000000000
[ 97.261386] x15: 0000000000000096 x14: 0000000000000000
[ 97.266865] x13: 0000000000000000 x12: 0000000000000000
[ 97.272344] x11: 0000000000000020 x10: 0101010101010101
[ 97.277824] x9 : ffffff80ffffffc8 x8 : 0000000000000040
[ 97.283303] x7 : 0000000000000001 x6 : 0000ffff7cc669f0
[ 97.288782] x5 : 0000000000015551 x4 : 0000000000000888
[ 97.294261] x3 : 0000000000000000 x2 : 0000000000000002
[ 97.299741] x1 : 000000001054db10 x0 : 0000000000000002
[ 97.305220] Process bash (pid: 2104, stack limit = 0xffff80a1612b4000)
[ 97.311960] ---[ end trace a1f48abe30820241 ]---

Regards,
Oza.
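
The question at the top of the thread (which PCIe devices are candidates for binding to vfio) can also be checked by hand before running setup.sh. The sketch below walks sysfs and prints only the PCI functions whose class code is 0x010802 (mass storage, non-volatile memory, NVMe programming interface), together with the driver currently bound to each. It is only an illustration: the function name list_nvme_devices and its optional directory argument are invented here, and it is not a copy of what setup.sh does. With an empty slot and no other NVMe device in the system, it would be expected to print nothing.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPDK): list PCI functions that report the
# NVMe class code, plus the driver currently bound to each one.
# $1 optionally overrides the sysfs directory, which is handy for testing.
list_nvme_devices() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Skip entries that do not expose a readable class file.
        [ -r "$dev/class" ] || continue
        class=$(cat "$dev/class")
        # 0x010802 = class 01 (mass storage), subclass 08 (NVM), prog-if 02 (NVMe).
        case "$class" in
            0x010802) ;;
            *) continue ;;
        esac
        # The driver entry is a symlink that exists only while a driver is bound.
        if [ -L "$dev/driver" ]; then
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}
```

Calling list_nvme_devices with no argument reads /sys/bus/pci/devices. Each output line is a device address followed by the bound driver name, or (none) if no driver is bound.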