On 2014/12/11 15:35, Yinghai Lu wrote:
> On Fri, Dec 5, 2014 at 3:26 PM, tip-bot for Thomas Gleixner
> wrote:
>> Commit-ID: e9220e591375af6d02604c261999df570fba744f
>> Gitweb: http://git.kernel.org/tip/e9220e591375af6d02604c261999df570fba744f
>> Author: Thomas Gleixner
>> AuthorDate: Fri, 5 Dec 2014 08:48:32 +0000
>> Committer: Thomas Gleixner
>> CommitDate: Sat, 6 Dec 2014 00:19:25 +0100
>>
>> iommu/vt-d: Move iommu preparatory allocations to irq_remap_ops.prepare
>>
>> The whole iommu setup for irq remapping is a convoluted mess. The
>> iommu detect function gets called from mem_init() and the prepare
>> callback gets called from enable_IR_x2apic() for unknown reasons.
>
> Got
>
Hi Yinghai,

From the following log messages, it seems that the AHCI controller
allocates 16 MSI/MSI-X interrupts and then triggers a NULL pointer
dereference when enabling interrupts for AHCI. This code path
(allocating/enabling MSI/MSI-X interrupts with IR enabled) doesn't
trigger a panic on my test system. So could you please help to get
more info with the attached test patch?

Thanks!
Gerry

> [ 134.510969] calling ahci_pci_driver_init+0x0/0x1b @ 1
> [ 134.511387] ahci 0000:00:1f.2: version 3.0
> [ 134.530941] alloc irq_desc for 91 on node 0
> [ 134.531168] alloc irq_desc for 92 on node 0
> [ 134.550728] alloc irq_desc for 93 on node 0
> [ 134.550995] alloc irq_desc for 94 on node 0
> [ 134.551199] alloc irq_desc for 95 on node 0
> [ 134.570871] alloc irq_desc for 96 on node 0
> [ 134.571090] alloc irq_desc for 97 on node 0
> [ 134.571303] alloc irq_desc for 98 on node 0
> [ 134.590974] alloc irq_desc for 99 on node 0
> [ 134.591205] alloc irq_desc for 100 on node 0
> [ 134.610882] alloc irq_desc for 101 on node 0
> [ 134.611136] alloc irq_desc for 102 on node 0
> [ 134.611364] alloc irq_desc for 103 on node 0
> [ 134.630992] alloc irq_desc for 104 on node 0
> [ 134.631232] alloc irq_desc for 105 on node 0
> [ 134.650885] alloc irq_desc for 106 on node 0
> [ 134.651246] ahci 0000:00:1f.2: SSS flag set, parallel bus scan disabled
> [ 134.670926] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3
> Gbps 0x3f impl SATA mode
> [ 134.671349] ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led
> clo pio slum part ccc ems sxs
> [ 134.691158] ahci 0000:00:1f.2: with iommu 3 : domain 10
> [ 134.751560] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000118
> [ 134.751997] IP: [] modify_irte+0x40/0xd0
> [ 134.770893] PGD 0
> [ 134.771011] Oops: 0000 [#1] SMP
> [ 134.771195] Modules linked in:
> [ 134.771344] CPU: 0 PID: 2169 Comm: kworker/0:1 Tainted: G W
> [ 134.811557] Workqueue: events work_for_cpu_fn
> [ 134.830823] task: ffff881024725240 ti: ffff8810252f8000 task.ti:
> ffff8810252f8000
> [ 134.831176] RIP: 0010:[] []
> modify_irte+0x40/0xd0
> [ 134.851029] RSP: 0000:ffff8810252fba18 EFLAGS: 00010096
> [ 134.851276] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000be00bd
> [ 134.871322] RDX: 0000000000000000 RSI: ffffffff81eafe3f RDI: 0000000000000046
> [ 134.891061] RBP: ffff8810252fba48 R08: 0000000000000001 R09: 0000000000000001
> [ 134.891393] R10: ffff881024725240 R11: 0000000000000292 R12: 0000000000000000
> [ 134.911249] R13: 0000000000000096 R14: ffff881022b181d0 R15: ffff880079268260
> [ 134.930824] FS: 0000000000000000(0000) GS:ffff88103de00000(0000)
> knlGS:0000000000000000
> [ 134.931202] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 134.950966] CR2: 0000000000000118 CR3: 0000000002c1a000 CR4: 00000000000007f0
> [ 134.970775] Stack:
> [ 134.970883] ffff8810252fba88 0000000000000046 ffff881022f6c660
> ffff881026d00000
> [ 134.971253] 000000000000005c ffff880079268200 ffff8810252fba58
> ffffffff81eaff26
> [ 134.991066] ffff8810252fba78 ffffffff81106761 ffff880079268200
> ffff88103d889400
> [ 135.010908] Call Trace:
> [ 135.011038] [] intel_irq_remapping_activate+0x16/0x20
> [ 135.030800] [] irq_domain_activate_irq+0x41/0x50
> [ 135.031103] [] irq_domain_activate_irq+0x2b/0x50
> [ 135.050857] [] irq_startup+0x29/0x70
> [ 135.051091] [] __setup_irq+0x327/0x590
> [ 135.070849] [] ? ahci_bad_pmp_check_ready+0x70/0x70
> [ 135.071143] [] request_threaded_irq+0xf2/0x150
> [ 135.090972] [] ? ahci_bad_pmp_check_ready+0x70/0x70
> [ 135.091295] [] ? ahci_host_activate+0x180/0x180
> [ 135.111014] [] devm_request_threaded_irq+0x5f/0xb0
> [ 135.130804] [] ahci_host_activate+0xa3/0x180
> [ 135.131097] [] ahci_init_one+0x9d1/0xac0
> [ 135.150841] [] local_pci_probe+0x45/0xa0
> [ 135.151127] [] work_for_cpu_fn+0x18/0x30
> [ 135.170843] [] process_one_work+0x254/0x470
> [ 135.171103] [] ? process_one_work+0x1b9/0x470
> [ 135.190846] [] worker_thread+0x31b/0x4e0
> [ 135.191115] [] ? trace_hardirqs_on+0xd/0x10
> [ 135.210920] [] ? pool_mayday_timeout+0x170/0x170
> [ 135.211215] [] kthread+0x101/0x110
> [ 135.230902] [] ? trace_hardirqs_on+0xd/0x10
> [ 135.231157] [] ? kthread_stop+0x100/0x100
> [ 135.250930] [] ret_from_fork+0x7c/0xb0
> [ 135.251178] [] ? kthread_stop+0x100/0x100
> [ 135.270969] Code: ec 10 48 85 ff 0f 84 90 00 00 00 48 c7 c7 80 34
> e0 82 49 89 f6 e8 21 54 16 00 0f b7 53 08 49 89 c5 0f b7 43 0a 4c 8b
> 23 8d 1c 02 <49> 8b 84 24 18 01 00 00 48 63 fb 48 c1 e7 04 48 03 38 49
> 8b 06
> [ 135.291699] RIP [] modify_irte+0x40/0xd0
> [ 135.311051] RSP
> [ 135.311215] CR2: 0000000000000118
> [ 135.330856] ---[ end trace fee039719f1667df ]---
> [ 135.333024] BUG: unable to handle kernel paging request at ffffffffffffff98
> [ 135.350911] IP: [] kthread_data+0x10/0x20
> [ 135.351230] PGD 2c1b067 PUD 2c1d067 PMD 0
> [ 135.351443] Oops: 0000 [#2] SMP
> [ 135.370998] Modules linked in:
> [ 135.371168] CPU: 0 PID: 2169 Comm: kworker/0:1 Tainted: G D W
> [ 135.412423] task: ffff881024725240 ti: ffff8810252f8000 task.ti:
> ffff8810252f8000
> [ 135.412798] RIP: 0010:[] []
> kthread_data+0x10/0x20
> [ 135.431159] RSP: 0000:ffff8810252fb538 EFLAGS: 00010096
> [ 135.450891] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000f
> [ 135.451218] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff881024725240
> [ 135.471044] RBP: ffff8810252fb538 R08: ffff8810247252d0 R09: 0000000000000001
> [ 135.490873] R10: ffff881024725240 R11: 000000000000001a R12: ffff88103dfd2c40
> [ 135.491237] R13: 0000000000000000 R14: 0000000000000000 R15: ffff881024725240
> [ 135.511046] FS: 0000000000000000(0000) GS:ffff88103de00000(0000)
> knlGS:0000000000000000
> [ 135.530882] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 135.531160] CR2: 0000000000000028 CR3: 0000000002c1a000 CR4: 00000000000007f0
> [ 135.550979] Stack:
> [ 135.551074] ffff8810252fb558 ffffffff810bd065 ffff8810252fb558
> ffff881024725240
> [ 135.570927] ffff8810252fb678 ffffffff8200fc0b ffff881025ffec00
> 0000000000009000
> [ 135.571302] ffff881024725240 ffff8810252fbfd8 ffff88103dfd3a40
> ffff881024725240
> [ 135.591114] Call Trace:
> [ 135.591231] [] wq_worker_sleeping+0x15/0xb0
> [ 135.610996] [] __schedule+0x18b/0xa70
> [ 135.611237] [] ? trace_hardirqs_on+0xd/0x10
> [ 135.630988] [] ? do_exit+0x88a/0x9f0
> [ 135.631222] [] ? do_exit+0x88a/0x9f0
> [ 135.650932] [] schedule+0x65/0x70
> [ 135.651186] [] do_exit+0x955/0x9f0
> [ 135.670899] [] oops_end+0xb8/0xd0
> [ 135.671136] [] no_context+0x309/0x352
> [ 135.671373] [] __bad_area_nosemaphore+0x1c5/0x1e4
> [ 135.691185] [] bad_area_nosemaphore+0x13/0x15
> [ 135.710934] [] __do_page_fault+0x266/0x590
> [ 135.711292] [] ? task_rq_lock+0x50/0xb0
> [ 135.730941] [] ? task_rq_lock+0x50/0xb0
> [ 135.731200] [] ? _raw_spin_lock+0x62/0x70
> [ 135.750949] [] ? task_rq_lock+0x50/0xb0
> [ 135.751195] [] ? trace_hardirqs_on_caller+0x16/0x260
> [ 135.770989] [] ? trace_hardirqs_off_caller+0x1f/0x160
> [ 135.771309] [] do_page_fault+0x46/0x80
> [ 135.791081] [] page_fault+0x22/0x30
> [ 135.791310] [] ? modify_irte+0x2f/0xd0
> [ 135.811037] [] ? modify_irte+0x40/0xd0
> [ 135.811315] [] ? modify_irte+0x2f/0xd0
> [ 135.831150] [] intel_irq_remapping_activate+0x16/0x20
> [ 135.831461] [] irq_domain_activate_irq+0x41/0x50
> [ 135.851716] [] irq_domain_activate_irq+0x2b/0x50
> [ 135.852020] [] irq_startup+0x29/0x70
> [ 135.871401] [] __setup_irq+0x327/0x590
> [ 135.871653] [] ? ahci_bad_pmp_check_ready+0x70/0x70
> [ 135.891334] [] request_threaded_irq+0xf2/0x150
> [ 135.911099] [] ? ahci_bad_pmp_check_ready+0x70/0x70
> [ 135.911416] [] ? ahci_host_activate+0x180/0x180
> [ 135.931274] [] devm_request_threaded_irq+0x5f/0xb0
> [ 135.931568] [] ahci_host_activate+0xa3/0x180
> [ 135.951093] [] ahci_init_one+0x9d1/0xac0
> [ 135.951375] [] local_pci_probe+0x45/0xa0
> [ 135.971111] [] work_for_cpu_fn+0x18/0x30
> [ 135.971366] [] process_one_work+0x254/0x470
> [ 135.991196] [] ? process_one_work+0x1b9/0x470
> [ 135.991477] [] worker_thread+0x31b/0x4e0
> [ 136.011132] [] ? trace_hardirqs_on+0xd/0x10
> [ 136.011393] [] ? pool_mayday_timeout+0x170/0x170
> [ 136.031187] [] kthread+0x101/0x110
> [ 136.031420] [] ? trace_hardirqs_on+0xd/0x10
> [ 136.051244] [] ? kthread_stop+0x100/0x100
> [ 136.051494] [] ret_from_fork+0x7c/0xb0
> [ 136.071210] [] ? kthread_stop+0x100/0x100
> [ 136.071495] Code: 00 48 89 e5 5d 48 8b 40 88 48 c1 e8 02 83 e0 01
> c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 87 b8 08 00 00
> 55 48 89 e5 <48> 8b 40 98 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66
> 66 90
> [ 136.111619] RIP [] kthread_data+0x10/0x20
> [ 136.131069] RSP
> [ 136.131253] CR2: ffffffffffffff98
> [ 136.131406] ---[ end trace fee039719f1667e0 ]---
> [ 136.151131] Fixing recursive fault but reboot is needed!
>
> It is in tip/apic
>
> Thanks
>
> Yinghai
>