On Fri, Aug 27, 2021 at 6:01 PM Jason Gunthorpe wrote:
>
> On Thu, Aug 26, 2021 at 09:15:38PM -0700, Selvin Xavier wrote:
> > Following Host crash is observed when pci_enable_atomic_ops_to_root
> > is called with VF PCI device.
> >
> > PID: 4481 TASK: ffff89c6941b0000 CPU: 53 COMMAND: "bash"
> > #0 [ffff9a94817136d8] machine_kexec at ffffffffb90601a4
> > #1 [ffff9a9481713728] __crash_kexec at ffffffffb9190d5d
> > #2 [ffff9a94817137f0] crash_kexec at ffffffffb9191c4d
> > #3 [ffff9a9481713808] oops_end at ffffffffb9025cd6
> > #4 [ffff9a9481713828] page_fault_oops at ffffffffb906e417
> > #5 [ffff9a9481713888] exc_page_fault at ffffffffb9a0ad14
> > #6 [ffff9a94817138b0] asm_exc_page_fault at ffffffffb9c00ace
> > [exception RIP: pcie_capability_read_dword+28]
> > RIP: ffffffffb952fd5c RSP: ffff9a9481713960 RFLAGS: 00010246
> > RAX: 0000000000000001 RBX: ffff89c6b1096000 RCX: 0000000000000000
> > RDX: ffff9a9481713990 RSI: 0000000000000024 RDI: 0000000000000000
> > RBP: 0000000000000080 R8: 0000000000000008 R9: ffff89c64341a2f8
> > R10: 0000000000000002 R11: 0000000000000000 R12: ffff89c648bab000
> > R13: 0000000000000000 R14: 0000000000000000 R15: ffff89c648bab0c8
> > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > #7 [ffff9a9481713988] pci_enable_atomic_ops_to_root at ffffffffb95359a6
> > #8 [ffff9a94817139c0] bnxt_qplib_determine_atomics at ffffffffc08c1a33 [bnxt_re]
> > #9 [ffff9a94817139d0] bnxt_re_dev_init at ffffffffc08ba2d1 [bnxt_re]
> > RIP: 00007f450602f648 RSP: 00007ffe880869e8 RFLAGS: 00000246
> > RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f450602f648
> > RDX: 0000000000000002 RSI: 0000555c566c4a60 RDI: 0000000000000001
> > RBP: 0000555c566c4a60 R8: 000000000000000a R9: 00007f45060c2580
> > R10: 000000000000000a R11: 0000000000000246 R12: 00007f45063026e0
> > R13: 0000000000000002 R14: 00007f45062fd880 R15: 0000000000000002
> > ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
>

Apologies for the delay in my response.
I was checking internally to see whether this is specific to one
adapter/host. I see the problem on multiple systems.

> This feels like a bug in pci_enable_atomic_ops_to_root()? I assume it
> hit a case where bus->self == NULL?

Yes. It crashes because bus->self is NULL. Is that expected for a VF?

>
> Why not fix it there?

Since it's a functional breakage in 5.14, I posted a quick fix for 5.14.
Also, we haven't done any testing on VFs for this feature, so I wanted
to avoid claiming VF support anyway.

I see that other drivers also call pci_enable_atomic_ops_to_root()
without a VF/PF check. Is anyone else seeing this issue?

>
> Jason