* warning from domain_get_iommu
@ 2020-02-04 20:07 ` Jerry Snitselaar
  0 siblings, 0 replies; 12+ messages in thread
From: Jerry Snitselaar @ 2020-02-04 20:07 UTC (permalink / raw)
  To: iommu, linux-kernel, Lu Baolu

I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 2.832615] ehci-pci: EHCI PCI platform driver
[ 2.834190] ehci-pci 0000:00:1a.0: EHCI Host Controller
[ 2.835974] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1
[ 2.838276] ehci-pci 0000:00:1a.0: debug port 2
[ 2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
[ 2.840671] Modules linked in:
[ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1
[ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
[ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[ 2.840671] RSP: 0018:ffffc900000dfab8 EFLAGS: 00010202
[ 2.840671] RAX: ffff88ec7f1c8000 RBX: 0000006c7c867000 RCX: 0000000000000000
[ 2.840671] RDX: 00000000fffffff0 RSI: 0000000000000000 RDI: ffff88ec7f1c8000
[ 2.840671] RBP: ffff88ec6f7000b0 R08: ffff88ec7f19d000 R09: ffff88ec7cbfcd00
[ 2.840671] R10: 0000000000000095 R11: ffffc900000df928 R12: 0000000000000000
[ 2.840671] R13: ffff88ec7f1c8000 R14: 0000000000001000 R15: 00000000ffffffff
[ 2.840671] FS: 0000000000000000(0000) GS:ffff88ec7f600000(0000) knlGS:0000000000000000
[ 2.840671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.840671] CR2: 00007ff3e1713000 CR3: 0000006c7de0a004 CR4: 00000000001606b0
[ 2.840671] Call Trace:
[ 2.840671] __intel_map_single+0x62/0x140
[ 2.840671] intel_alloc_coherent+0xa6/0x130
[ 2.840671] dma_pool_alloc+0xd8/0x1e0
[ 2.840671] e_qh_alloc+0x55/0x130
[ 2.840671] ehci_setup+0x284/0x7b0
[ 2.840671] ehci_pci_setup+0xa3/0x530
[ 2.840671] usb_add_hcd+0x2b6/0x800
[ 2.840671] usb_hcd_pci_probe+0x375/0x460
[ 2.840671] local_pci_probe+0x41/0x90
[ 2.840671] pci_device_probe+0x105/0x1b0
[ 2.840671] driver_probe_device+0x12d/0x460
[ 2.840671] device_driver_attach+0x50/0x60
[ 2.840671] __driver_attach+0x61/0x130
[ 2.840671] ? device_driver_attach+0x60/0x60
[ 2.840671] bus_for_each_dev+0x77/0xc0
[ 2.840671] ? klist_add_tail+0x3b/0x70
[ 2.840671] bus_add_driver+0x14d/0x1e0
[ 2.840671] ? ehci_hcd_init+0xaa/0xaa
[ 2.840671] ? do_early_param+0x91/0x91
[ 2.840671] driver_register+0x6b/0xb0
[ 2.840671] ? ehci_hcd_init+0xaa/0xaa
[ 2.840671] do_one_initcall+0x46/0x1c3
[ 2.840671] ? do_early_param+0x91/0x91
[ 2.840671] kernel_init_freeable+0x1af/0x258
[ 2.840671] ? rest_init+0xaa/0xaa
[ 2.840671] kernel_init+0xa/0xf9
[ 2.840671] ret_from_fork+0x35/0x40
[ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[ 3.010848] ehci-pci 0000:00:1a.0: Using iommu dma mapping
[ 3.012551] ehci-pci 0000:00:1a.0: 32bit DMA uses non-identity mapping
[ 3.018537] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
[ 3.021188] ehci-pci 0000:00:1a.0: irq 18, io mem 0x93002000
[ 3.029006] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
[ 3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18
[ 3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 3.035900] usb usb1: Product: EHCI Host Controller
[ 3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
[ 3.039691] usb usb1: SerialNumber: 0000:00:1a.0

It looks like the device finishes initializing once it figures out it
needs dma mapping instead of the default
passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
calls __intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

one thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
	goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry

^ permalink raw reply [flat|nested] 12+ messages in thread
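The NULL check suggested in the message above can be sketched in isolation. What follows is a minimal userspace model in C, not kernel code: every toy_* name, the is_dma flag, and TOY_MAPPING_ERROR are hypothetical stand-ins, and the assumption that a non-DMA domain is what makes the lookup return NULL is taken from the WARN_ON discussion in this thread rather than from the driver source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct toy_iommu {
	unsigned int seq_id;
};

struct toy_domain {
	int is_dma;              /* non-zero for a DMA-type domain */
	struct toy_iommu *iommu; /* unit backing the domain, may be NULL */
};

#define TOY_MAPPING_ERROR UINT64_MAX

/* Assumed behaviour: the lookup yields NULL for a non-DMA domain. */
struct toy_iommu *toy_domain_get_iommu(const struct toy_domain *domain)
{
	if (!domain->is_dma)
		return NULL;
	return domain->iommu;
}

/* Maps one address and bails out before a NULL iommu can be used. */
uint64_t toy_map_single(const struct toy_domain *domain, uint64_t paddr)
{
	struct toy_iommu *iommu = toy_domain_get_iommu(domain);

	if (!iommu)
		return TOY_MAPPING_ERROR; /* plays the role of "goto error" */

	/* Placeholder for the real mapping work: tag the address with the unit id. */
	return ((uint64_t)iommu->seq_id << 48) | paddr;
}
```

Calling toy_map_single() with a domain whose is_dma is 0 returns TOY_MAPPING_ERROR instead of touching the NULL pointer. The check only avoids the dereference; it does not explain why a non-DMA domain reached the mapping path in the first place.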
* Re: warning from domain_get_iommu
  2020-02-04 20:07 ` Jerry Snitselaar
@ 2020-02-06 17:43 ` Jerry Snitselaar
  -1 siblings, 0 replies; 12+ messages in thread
From: Jerry Snitselaar @ 2020-02-06 17:43 UTC (permalink / raw)
  To: iommu, linux-kernel, Lu Baolu

On Tue Feb 04 20, Jerry Snitselaar wrote:
>I'm working on getting a system to reproduce this, and verify it also occurs
>with 5.5, but I have a report of a case where the kdump kernel gives
>warnings like the following on a hp dl360 gen9:
>
>[ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
>[ 2.832615] ehci-pci: EHCI PCI platform driver
>[ 2.834190] ehci-pci 0000:00:1a.0: EHCI Host Controller
>[ 2.835974] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1
>[ 2.838276] ehci-pci 0000:00:1a.0: debug port 2
>[ 2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
>[ 2.840671] Modules linked in:
>[ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1
>[ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
>[ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
>[ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
>[ 2.840671] RSP: 0018:ffffc900000dfab8 EFLAGS: 00010202
>[ 2.840671] RAX: ffff88ec7f1c8000 RBX: 0000006c7c867000 RCX: 0000000000000000
>[ 2.840671] RDX: 00000000fffffff0 RSI: 0000000000000000 RDI: ffff88ec7f1c8000
>[ 2.840671] RBP: ffff88ec6f7000b0 R08: ffff88ec7f19d000 R09: ffff88ec7cbfcd00
>[ 2.840671] R10: 0000000000000095 R11: ffffc900000df928 R12: 0000000000000000
>[ 2.840671] R13: ffff88ec7f1c8000 R14: 0000000000001000 R15: 00000000ffffffff
>[ 2.840671] FS: 0000000000000000(0000) GS:ffff88ec7f600000(0000) knlGS:0000000000000000
>[ 2.840671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[ 2.840671] CR2: 00007ff3e1713000 CR3: 0000006c7de0a004 CR4: 00000000001606b0
>[ 2.840671] Call Trace:
>[ 2.840671] __intel_map_single+0x62/0x140
>[ 2.840671] intel_alloc_coherent+0xa6/0x130
>[ 2.840671] dma_pool_alloc+0xd8/0x1e0
>[ 2.840671] e_qh_alloc+0x55/0x130
>[ 2.840671] ehci_setup+0x284/0x7b0
>[ 2.840671] ehci_pci_setup+0xa3/0x530
>[ 2.840671] usb_add_hcd+0x2b6/0x800
>[ 2.840671] usb_hcd_pci_probe+0x375/0x460
>[ 2.840671] local_pci_probe+0x41/0x90
>[ 2.840671] pci_device_probe+0x105/0x1b0
>[ 2.840671] driver_probe_device+0x12d/0x460
>[ 2.840671] device_driver_attach+0x50/0x60
>[ 2.840671] __driver_attach+0x61/0x130
>[ 2.840671] ? device_driver_attach+0x60/0x60
>[ 2.840671] bus_for_each_dev+0x77/0xc0
>[ 2.840671] ? klist_add_tail+0x3b/0x70
>[ 2.840671] bus_add_driver+0x14d/0x1e0
>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa
>[ 2.840671] ? do_early_param+0x91/0x91
>[ 2.840671] driver_register+0x6b/0xb0
>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa
>[ 2.840671] do_one_initcall+0x46/0x1c3
>[ 2.840671] ? do_early_param+0x91/0x91
>[ 2.840671] kernel_init_freeable+0x1af/0x258
>[ 2.840671] ? rest_init+0xaa/0xaa
>[ 2.840671] kernel_init+0xa/0xf9
>[ 2.840671] ret_from_fork+0x35/0x40
>[ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
>[ 3.010848] ehci-pci 0000:00:1a.0: Using iommu dma mapping
>[ 3.012551] ehci-pci 0000:00:1a.0: 32bit DMA uses non-identity mapping
>[ 3.018537] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
>[ 3.021188] ehci-pci 0000:00:1a.0: irq 18, io mem 0x93002000
>[ 3.029006] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
>[ 3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18
>[ 3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
>[ 3.035900] usb usb1: Product: EHCI Host Controller
>[ 3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
>[ 3.039691] usb usb1: SerialNumber: 0000:00:1a.0
>
>It looks like the device finishes initializing once it figures out it
>needs dma mapping instead of the default
>passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
>calls __intel_map_single, so I'm not sure why it is tripping over the
>WARN_ON in domain_get_iommu.
>
>one thing I noticed while looking at this is that domain_get_iommu can
>return NULL. So should there be something like the following in
>__intel_map_single after the domain_get_iommu call?
>
>if (!iommu)
>	goto error;
>
>It is possible to deref the null pointer later otherwise.
>
>Regards,
>Jerry

I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: warning from domain_get_iommu
  2020-02-06 17:43 ` Jerry Snitselaar
@ 2020-02-07 9:34 ` Jerry Snitselaar
  -1 siblings, 0 replies; 12+ messages in thread
From: Jerry Snitselaar @ 2020-02-07 9:34 UTC (permalink / raw)
  To: iommu, linux-kernel, Lu Baolu

On Thu Feb 06 20, Jerry Snitselaar wrote:
>On Tue Feb 04 20, Jerry Snitselaar wrote:
>>I'm working on getting a system to reproduce this, and verify it also occurs
>>with 5.5, but I have a report of a case where the kdump kernel gives
>>warnings like the following on a hp dl360 gen9:
>>
>>[ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
>>[ 2.832615] ehci-pci: EHCI PCI platform driver
>>[ 2.834190] ehci-pci 0000:00:1a.0: EHCI Host Controller
>>[ 2.835974] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1
>>[ 2.838276] ehci-pci 0000:00:1a.0: debug port 2
>>[ 2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
>>[ 2.840671] Modules linked in:
>>[ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1
>>[ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
>>[ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
>>[ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
>>[ 2.840671] RSP: 0018:ffffc900000dfab8 EFLAGS: 00010202
>>[ 2.840671] RAX: ffff88ec7f1c8000 RBX: 0000006c7c867000 RCX: 0000000000000000
>>[ 2.840671] RDX: 00000000fffffff0 RSI: 0000000000000000 RDI: ffff88ec7f1c8000
>>[ 2.840671] RBP: ffff88ec6f7000b0 R08: ffff88ec7f19d000 R09: ffff88ec7cbfcd00
>>[ 2.840671] R10: 0000000000000095 R11: ffffc900000df928 R12: 0000000000000000
>>[ 2.840671] R13: ffff88ec7f1c8000 R14: 0000000000001000 R15: 00000000ffffffff
>>[ 2.840671] FS: 0000000000000000(0000) GS:ffff88ec7f600000(0000) knlGS:0000000000000000
>>[ 2.840671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>[ 2.840671] CR2: 00007ff3e1713000 CR3: 0000006c7de0a004 CR4: 00000000001606b0
>>[ 2.840671] Call Trace:
>>[ 2.840671] __intel_map_single+0x62/0x140
>>[ 2.840671] intel_alloc_coherent+0xa6/0x130
>>[ 2.840671] dma_pool_alloc+0xd8/0x1e0
>>[ 2.840671] e_qh_alloc+0x55/0x130
>>[ 2.840671] ehci_setup+0x284/0x7b0
>>[ 2.840671] ehci_pci_setup+0xa3/0x530
>>[ 2.840671] usb_add_hcd+0x2b6/0x800
>>[ 2.840671] usb_hcd_pci_probe+0x375/0x460
>>[ 2.840671] local_pci_probe+0x41/0x90
>>[ 2.840671] pci_device_probe+0x105/0x1b0
>>[ 2.840671] driver_probe_device+0x12d/0x460
>>[ 2.840671] device_driver_attach+0x50/0x60
>>[ 2.840671] __driver_attach+0x61/0x130
>>[ 2.840671] ? device_driver_attach+0x60/0x60
>>[ 2.840671] bus_for_each_dev+0x77/0xc0
>>[ 2.840671] ? klist_add_tail+0x3b/0x70
>>[ 2.840671] bus_add_driver+0x14d/0x1e0
>>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa
>>[ 2.840671] ? do_early_param+0x91/0x91
>>[ 2.840671] driver_register+0x6b/0xb0
>>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa
>>[ 2.840671] do_one_initcall+0x46/0x1c3
>>[ 2.840671] ? do_early_param+0x91/0x91
>>[ 2.840671] kernel_init_freeable+0x1af/0x258
>>[ 2.840671] ? rest_init+0xaa/0xaa
>>[ 2.840671] kernel_init+0xa/0xf9
>>[ 2.840671] ret_from_fork+0x35/0x40
>>[ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
>>[ 3.010848] ehci-pci 0000:00:1a.0: Using iommu dma mapping
>>[ 3.012551] ehci-pci 0000:00:1a.0: 32bit DMA uses non-identity mapping
>>[ 3.018537] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
>>[ 3.021188] ehci-pci 0000:00:1a.0: irq 18, io mem 0x93002000
>>[ 3.029006] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
>>[ 3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18
>>[ 3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
>>[ 3.035900] usb usb1: Product: EHCI Host Controller
>>[ 3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
>>[ 3.039691] usb usb1: SerialNumber: 0000:00:1a.0
>>
>>It looks like the device finishes initializing once it figures out it
>>needs dma mapping instead of the default
>>passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
>>calls __intel_map_single, so I'm not sure why it is tripping over the
>>WARN_ON in domain_get_iommu.
>>
>>one thing I noticed while looking at this is that domain_get_iommu can
>>return NULL. So should there be something like the following in
>>__intel_map_single after the domain_get_iommu call?
>>
>>if (!iommu)
>>	goto error;
>>
>>It is possible to deref the null pointer later otherwise.
>>
>>Regards,
>>Jerry
>
>I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.

Hi Baolu,

I think I understand what is happening here. With the kdump boot,
translation is pre-enabled, so in intel_iommu_add_device things are
getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
calls iommu_need_mapping it returns true, but doesn't do the dma
domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then
__intel_map_single gets called and it calls deferred_attach_domain,
which sets the domain to the group domain, which in this case is the
identity domain. Then it calls domain_get_iommu, which spits out the
warning because the domain type is not dma, and returns null.

My workaround was to add a call to iommu_need_mapping and find_domain
after the deferred_attach_domain, but I don't know if that is the
correct solution. There are a couple of other spots, like intel_map_sg,
that have the deferred_attach_domain after iommu_need_mapping and will
possibly suffer from the same problem.

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b5c5ab58d395..063f45323cfc 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3515,6 +3515,10 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 	if (!domain)
 		return DMA_MAPPING_ERROR;
 
+	if (!iommu_need_mapping(dev))
+		return paddr;
+
+	domain = find_domain(dev);
 	iommu = domain_get_iommu(domain);
 	size = aligned_nrpages(paddr, size);
 

I finally got a git repo over to one of these systems, and was able to
reproduce the issue with the head of Linus's tree. With commit
9235cb13d7d1 ("iommu/vt-d: Allow devices with RMRRs to use identity
domain") there are more of the warnings, because devices are using
identity that weren't before.

Regards,
Jerry

^ permalink raw reply related [flat|nested] 12+ messages in thread
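The sequence described in the message above (deferred attach, identity group domain, warning from the lookup) and the effect of the proposed re-check can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the driver: all sk_* names are hypothetical, the mask_covers_memory flag stands in for the DMA-mask test that decides whether identity mapping is good enough, and returning an error after the warning is a modelling choice rather than a claim about what the kernel does next.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; not the kernel's structures. */
enum sk_domain_type { SK_DOMAIN_DMA, SK_DOMAIN_IDENTITY };

struct sk_domain {
	enum sk_domain_type type;
};

struct sk_device {
	bool deferred;                  /* models DEFER_DEVICE_DOMAIN_INFO */
	bool mask_covers_memory;        /* false models a too-small DMA mask */
	struct sk_domain *domain;       /* attached domain, NULL while deferred */
	struct sk_domain *group_domain; /* what a deferred attach installs */
};

struct sk_domain sk_identity = { SK_DOMAIN_IDENTITY };
struct sk_domain sk_dma = { SK_DOMAIN_DMA };
int sk_warnings; /* counts simulated WARN_ON hits */

#define SK_MAPPING_ERROR UINT64_MAX
#define SK_IOVA_TAG ((uint64_t)1 << 63) /* marks a translated address */

/* While attach is deferred there is no domain to inspect or switch. */
bool sk_need_mapping(struct sk_device *dev)
{
	if (dev->deferred)
		return true;
	if (dev->domain->type == SK_DOMAIN_IDENTITY) {
		if (dev->mask_covers_memory)
			return false;
		dev->domain = &sk_dma; /* fall back to a DMA domain */
	}
	return true;
}

void sk_deferred_attach(struct sk_device *dev)
{
	if (dev->deferred) {
		dev->domain = dev->group_domain;
		dev->deferred = false;
	}
}

/* recheck selects whether the proposed workaround is modelled. */
uint64_t sk_map_single(struct sk_device *dev, uint64_t paddr, bool recheck)
{
	sk_deferred_attach(dev);
	if (!dev->domain)
		return SK_MAPPING_ERROR;

	if (recheck && !sk_need_mapping(dev))
		return paddr; /* identity mapping is sufficient */

	if (dev->domain->type != SK_DOMAIN_DMA) {
		sk_warnings++; /* stands in for the WARN_ON */
		return SK_MAPPING_ERROR;
	}
	return paddr | SK_IOVA_TAG;
}

uint64_t sk_alloc_coherent(struct sk_device *dev, uint64_t paddr, bool recheck)
{
	if (!sk_need_mapping(dev))
		return paddr;
	return sk_map_single(dev, paddr, recheck);
}
```

With recheck set to false, a deferred device whose group domain is sk_identity reaches the type check with an identity domain and bumps sk_warnings. With recheck set to true, the second sk_need_mapping() call either returns the physical address unchanged or switches the device to sk_dma before the type check runs.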
* Re: warning from domain_get_iommu
  2020-02-07 9:34 ` Jerry Snitselaar
@ 2020-02-08 6:53 ` Lu Baolu
  -1 siblings, 0 replies; 12+ messages in thread
From: Lu Baolu @ 2020-02-08 6:53 UTC (permalink / raw)
  To: iommu, linux-kernel

Hi Jerry,

On 2020/2/7 17:34, Jerry Snitselaar wrote:
> On Thu Feb 06 20, Jerry Snitselaar wrote:
>> On Tue Feb 04 20, Jerry Snitselaar wrote:
>>> I'm working on getting a system to reproduce this, and verify it also
>>> occurs
>>> with 5.5, but I have a report of a case where the kdump kernel gives
>>> warnings like the following on a hp dl360 gen9:
>>>
>>> [ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI)
>>> Driver
>>> [ 2.832615] ehci-pci: EHCI PCI platform driver
>>> [ 2.834190] ehci-pci 0000:00:1a.0: EHCI Host Controller
>>> [ 2.835974] ehci-pci 0000:00:1a.0: new USB bus registered,
>>> assigned bus number 1
>>> [ 2.838276] ehci-pci 0000:00:1a.0: debug port 2
>>> [ 2.839700] WARNING: CPU: 0 PID: 1 at
>>> drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
>>> [ 2.840671] Modules linked in:
>>> [ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
>>> 4.18.0-170.el8.kdump2.x86_64 #1
>>> [ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360
>>> Gen9, BIOS P89 07/21/2019
>>> [ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
>>> [ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48
>>> 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b
>>> 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40
>>> 0f b6 f6
>>> [ 2.840671] RSP: 0018:ffffc900000dfab8 EFLAGS: 00010202
>>> [ 2.840671] RAX: ffff88ec7f1c8000 RBX: 0000006c7c867000 RCX:
>>> 0000000000000000
>>> [ 2.840671] RDX: 00000000fffffff0 RSI: 0000000000000000 RDI:
>>> ffff88ec7f1c8000
>>> [ 2.840671] RBP: ffff88ec6f7000b0 R08: ffff88ec7f19d000 R09:
>>> ffff88ec7cbfcd00
>>> [ 2.840671] R10: 0000000000000095 R11: ffffc900000df928 R12:
>>> 0000000000000000
>>> [ 2.840671] R13: ffff88ec7f1c8000 R14: 0000000000001000 R15:
>>> 00000000ffffffff
>>> [ 2.840671] FS: 0000000000000000(0000) GS:ffff88ec7f600000(0000)
>>> knlGS:0000000000000000
>>> [ 2.840671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 2.840671] CR2: 00007ff3e1713000 CR3: 0000006c7de0a004 CR4:
>>> 00000000001606b0
>>> [ 2.840671] Call Trace:
>>> [ 2.840671] __intel_map_single+0x62/0x140
>>> [ 2.840671] intel_alloc_coherent+0xa6/0x130
>>> [ 2.840671] dma_pool_alloc+0xd8/0x1e0
>>> [ 2.840671] e_qh_alloc+0x55/0x130
>>> [ 2.840671] ehci_setup+0x284/0x7b0
>>> [ 2.840671] ehci_pci_setup+0xa3/0x530
>>> [ 2.840671] usb_add_hcd+0x2b6/0x800
>>> [ 2.840671] usb_hcd_pci_probe+0x375/0x460
>>> [ 2.840671] local_pci_probe+0x41/0x90
>>> [ 2.840671] pci_device_probe+0x105/0x1b0
>>> [ 2.840671] driver_probe_device+0x12d/0x460
>>> [ 2.840671] device_driver_attach+0x50/0x60
>>> [ 2.840671] __driver_attach+0x61/0x130
>>> [ 2.840671] ? device_driver_attach+0x60/0x60
>>> [ 2.840671] bus_for_each_dev+0x77/0xc0
>>> [ 2.840671] ? klist_add_tail+0x3b/0x70
>>> [ 2.840671] bus_add_driver+0x14d/0x1e0
>>> [ 2.840671] ? ehci_hcd_init+0xaa/0xaa
>>> [ 2.840671] ? do_early_param+0x91/0x91
>>> [ 2.840671] driver_register+0x6b/0xb0
>>> [ 2.840671] ? ehci_hcd_init+0xaa/0xaa
>>> [ 2.840671] do_one_initcall+0x46/0x1c3
>>> [ 2.840671] ? do_early_param+0x91/0x91
>>> [ 2.840671] kernel_init_freeable+0x1af/0x258
>>> [ 2.840671] ? rest_init+0xaa/0xaa
>>> [ 2.840671] kernel_init+0xa/0xf9
>>> [ 2.840671] ret_from_fork+0x35/0x40
>>> [ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
>>> [ 3.010848] ehci-pci 0000:00:1a.0: Using iommu dma mapping
>>> [ 3.012551] ehci-pci 0000:00:1a.0: 32bit DMA uses non-identity
>>> mapping
>>> [ 3.018537] ehci-pci 0000:00:1a.0: cache line size of 64 is not
>>> supported
>>> [ 3.021188] ehci-pci 0000:00:1a.0: irq 18, io mem 0x93002000
>>> [ 3.029006] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
>>> [ 3.030918] usb usb1: New USB device found, idVendor=1d6b,
>>> idProduct=0002, bcdDevice= 4.18
>>> [ 3.033491] usb usb1: New USB device strings: Mfr=3, Product=2,
>>> SerialNumber=1
>>> [ 3.035900] usb usb1: Product: EHCI Host Controller
>>> [ 3.037423] usb usb1: Manufacturer: Linux
>>> 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
>>> [ 3.039691] usb usb1: SerialNumber: 0000:00:1a.0
>>>
>>> It looks like the device finishes initializing once it figures out it
>>> needs dma mapping instead of the default
>>> passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
>>> calls __intel_map_single, so I'm not sure why it is tripping over the
>>> WARN_ON in domain_get_iommu.
>>>
>>> one thing I noticed while looking at this is that domain_get_iommu can
>>> return NULL. So should there be something like the following in
>>> __intel_map_single after the domain_get_iommu call?
>>>
>>> if (!iommu)
>>>	goto error;
>>>
>>> It is possible to deref the null pointer later otherwise.
>>>
>>> Regards,
>>> Jerry
>>
>> I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.
>
> Hi Baolu,
>
> I think I understand what is happening here. With the kdump boot
> translation is pre-enabled, so in intel_iommu_add_device things are
> getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
> calls iommu_need_mapping it returns true, but doesn't do the dma
> domain switch because of DEFER_DEVICE_DOMAIN_INFO.
Then > __intel_map_single gets called and it calls deferred_attach_domain, > which sets the domain to the group domain, which in this case is the > identity domain. Then it calls domain_get_iommu, which spits out the > warning because the domain type was dma and returns null. My > workaround was to add a call to iommu_need_mapping and find_domain > after the deferred_attach_domain, but I don't know if that is the > correct solution. There are a couple other spots like intel_map_sg > that have the deferred_attach_domain after iommu_need_mapping that > possibly will suffer from the same problem. > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index b5c5ab58d395..063f45323cfc 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -3515,6 +3515,10 @@ static dma_addr_t __intel_map_single(struct > device *dev, phys_addr_t paddr, > if (!domain) > return DMA_MAPPING_ERROR; > > + if (!iommu_need_mapping(dev)) > + return paddr; > + > + domain = find_domain(dev); > iommu = domain_get_iommu(domain); > size = aligned_nrpages(paddr, size); > > > I finally got a git repo over to one of these systems, and was > able to reproduce the issue with the head of linus's tree. With commit > 9235cb13d7d1 ("iommu/vt-d: Allow devices with RMRRs to use identity > domain") > there are more of the warnings, because devices are using identity that > weren't before. > Is it possible to move deferred domain attachment to identity_mapping()? 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..234ab346198e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2913,13 +2913,11 @@ static int __init si_domain_init(int hw)

 static int identity_mapping(struct device *dev)
 {
-	struct device_domain_info *info;
+	struct dmar_domain *domain;

-	info = dev->archdata.iommu;
-	if (info && info != DUMMY_DEVICE_DOMAIN_INFO && info != DEFER_DEVICE_DOMAIN_INFO)
-		return (info->domain == si_domain);
+	domain = deferred_attach_domain(dev);

-	return 0;
+	return (!domain || domain_type_is_si(domain));
 }

 static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)

Best regards,
baolu

^ permalink raw reply related	[flat|nested] 12+ messages in thread
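A minimal userspace sketch of the identity_mapping() change proposed above, assuming a much simplified model of the driver state. Every name prefixed with model_ or MODEL_ is hypothetical and exists only for illustration; none of this is kernel code. It shows why a device whose attachment is still deferred is reported as "not identity" by the old check, and how resolving the deferred attachment first changes the answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two domain kinds discussed in the thread. */
enum model_type { MODEL_DMA, MODEL_IDENTITY };

struct model_domain {
	enum model_type type;
};

/* Hypothetical device state: "deferred" plays the role of
 * DEFER_DEVICE_DOMAIN_INFO, "group_default" the group's domain. */
struct model_device {
	bool deferred;
	struct model_domain *attached;
	struct model_domain *group_default;
};

/* Models deferred_attach_domain(): the first call attaches the group's
 * default domain; later calls just return what is attached. */
struct model_domain *model_deferred_attach(struct model_device *dev)
{
	if (dev->deferred) {
		dev->attached = dev->group_default;
		dev->deferred = false;
	}
	return dev->attached;
}

/* Old check: a deferred device is never considered identity mapped. */
bool model_identity_mapping_old(const struct model_device *dev)
{
	if (dev->deferred || dev->attached == NULL)
		return false;
	return dev->attached->type == MODEL_IDENTITY;
}

/* Proposed check: resolve the deferred attachment, then look at the type.
 * A device with no domain at all is treated as identity, as in the diff. */
bool model_identity_mapping_new(struct model_device *dev)
{
	struct model_domain *domain = model_deferred_attach(dev);

	return domain == NULL || domain->type == MODEL_IDENTITY;
}
```

In this model, a deferred device whose group default is the identity domain answers false to the old check and true to the new one, which is the case the kdump report appears to hit.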
* Re: warning from domain_get_iommu 2020-02-08 6:53 ` Lu Baolu @ 2020-02-08 10:19 ` Jerry Snitselaar -1 siblings, 0 replies; 12+ messages in thread From: Jerry Snitselaar @ 2020-02-08 10:19 UTC (permalink / raw) To: Lu Baolu; +Cc: iommu, linux-kernel On Sat Feb 08 20, Lu Baolu wrote: >Hi Jerry, > >On 2020/2/7 17:34, Jerry Snitselaar wrote: >>On Thu Feb 06 20, Jerry Snitselaar wrote: >>>On Tue Feb 04 20, Jerry Snitselaar wrote: >>>>I'm working on getting a system to reproduce this, and verify it >>>>also occurs >>>>with 5.5, but I have a report of a case where the kdump kernel gives >>>>warnings like the following on a hp dl360 gen9: >>>> >>>>[ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller >>>>(EHCI) Driver >>>>[ 2.832615] ehci-pci: EHCI PCI platform driver >>>>[ 2.834190] ehci-pci 0000:00:1a.0: EHCI Host Controller >>>>[ 2.835974] ehci-pci 0000:00:1a.0: new USB bus registered, >>>>assigned bus number 1 >>>>[ 2.838276] ehci-pci 0000:00:1a.0: debug port 2 >>>>[ 2.839700] WARNING: CPU: 0 PID: 1 at >>>>drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60 >>>>[ 2.840671] Modules linked in: >>>>[ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted >>>>4.18.0-170.el8.kdump2.x86_64 #1 >>>>[ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant >>>>DL360 Gen9, BIOS P89 07/21/2019 >>>>[ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60 >>>>[ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 >>>>0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 >>>>91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 >>>>00 00 41 55 40 0f b6 f6 >>>>[ 2.840671] RSP: 0018:ffffc900000dfab8 EFLAGS: 00010202 >>>>[ 2.840671] RAX: ffff88ec7f1c8000 RBX: 0000006c7c867000 RCX: >>>>0000000000000000 >>>>[ 2.840671] RDX: 00000000fffffff0 RSI: 0000000000000000 RDI: >>>>ffff88ec7f1c8000 >>>>[ 2.840671] RBP: ffff88ec6f7000b0 R08: ffff88ec7f19d000 R09: >>>>ffff88ec7cbfcd00 >>>>[ 2.840671] R10: 0000000000000095 R11: ffffc900000df928 R12: >>>>0000000000000000 
>>>>[ 2.840671] R13: ffff88ec7f1c8000 R14: 0000000000001000 R15: >>>>00000000ffffffff >>>>[ 2.840671] FS: 0000000000000000(0000) >>>>GS:ffff88ec7f600000(0000) knlGS:0000000000000000 >>>>[ 2.840671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>[ 2.840671] CR2: 00007ff3e1713000 CR3: 0000006c7de0a004 CR4: >>>>00000000001606b0 >>>>[ 2.840671] Call Trace: >>>>[ 2.840671] __intel_map_single+0x62/0x140 >>>>[ 2.840671] intel_alloc_coherent+0xa6/0x130 >>>>[ 2.840671] dma_pool_alloc+0xd8/0x1e0 >>>>[ 2.840671] e_qh_alloc+0x55/0x130 >>>>[ 2.840671] ehci_setup+0x284/0x7b0 >>>>[ 2.840671] ehci_pci_setup+0xa3/0x530 >>>>[ 2.840671] usb_add_hcd+0x2b6/0x800 >>>>[ 2.840671] usb_hcd_pci_probe+0x375/0x460 >>>>[ 2.840671] local_pci_probe+0x41/0x90 >>>>[ 2.840671] pci_device_probe+0x105/0x1b0 >>>>[ 2.840671] driver_probe_device+0x12d/0x460 >>>>[ 2.840671] device_driver_attach+0x50/0x60 >>>>[ 2.840671] __driver_attach+0x61/0x130 >>>>[ 2.840671] ? device_driver_attach+0x60/0x60 >>>>[ 2.840671] bus_for_each_dev+0x77/0xc0 >>>>[ 2.840671] ? klist_add_tail+0x3b/0x70 >>>>[ 2.840671] bus_add_driver+0x14d/0x1e0 >>>>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa >>>>[ 2.840671] ? do_early_param+0x91/0x91 >>>>[ 2.840671] driver_register+0x6b/0xb0 >>>>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa >>>>[ 2.840671] do_one_initcall+0x46/0x1c3 >>>>[ 2.840671] ? do_early_param+0x91/0x91 >>>>[ 2.840671] kernel_init_freeable+0x1af/0x258 >>>>[ 2.840671] ? 
rest_init+0xaa/0xaa >>>>[ 2.840671] kernel_init+0xa/0xf9 >>>>[ 2.840671] ret_from_fork+0x35/0x40 >>>>[ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]--- >>>>[ 3.010848] ehci-pci 0000:00:1a.0: Using iommu dma mapping >>>>[ 3.012551] ehci-pci 0000:00:1a.0: 32bit DMA uses >>>>non-identity mapping >>>>[ 3.018537] ehci-pci 0000:00:1a.0: cache line size of 64 is >>>>not supported >>>>[ 3.021188] ehci-pci 0000:00:1a.0: irq 18, io mem 0x93002000 >>>>[ 3.029006] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00 >>>>[ 3.030918] usb usb1: New USB device found, idVendor=1d6b, >>>>idProduct=0002, bcdDevice= 4.18 >>>>[ 3.033491] usb usb1: New USB device strings: Mfr=3, >>>>Product=2, SerialNumber=1 >>>>[ 3.035900] usb usb1: Product: EHCI Host Controller >>>>[ 3.037423] usb usb1: Manufacturer: Linux >>>>4.18.0-170.el8.kdump2.x86_64 ehci_hcd >>>>[ 3.039691] usb usb1: SerialNumber: 0000:00:1a.0 >>>> >>>>It looks like the device finishes initializing once it figures out it >>>>needs dma mapping instead of the default >>>>passthrough. intel_alloc_coherent calls iommu_need_mapping, before it >>>>calls __intel_map_single, so I'm not sure why it is tripping over the >>>>WARN_ON in domain_get_iommu. >>>> >>>>one thing I noticed while looking at this is that domain_get_iommu can >>>>return NULL. So should there be something like the following in >>>>__intel_map_single after the domain_get_iommu call? >>>> >>>>if (!iommu) >>>> goto error; >>>> >>>>It is possible to deref the null pointer later otherwise. >>>> >>>>Regards, >>>>Jerry >>> >>>I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE. >> >>Hi Baolu, >> >>I think I understand what is happening here. With the kdump boot >>translation is pre-enabled, so in intel_iommu_add_device things are >>getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent >>calls iommu_need_mapping it returns true, but doesn't do the dma >>domain switch because of DEFER_DEVICE_DOMAIN_INFO. 
Then >>__intel_map_single gets called and it calls deferred_attach_domain, >>which sets the domain to the group domain, which in this case is the >>identity domain. Then it calls domain_get_iommu, which spits out the >>warning because the domain type was dma and returns null. My >>workaround was to add a call to iommu_need_mapping and find_domain >>after the deferred_attach_domain, but I don't know if that is the >>correct solution. There are a couple other spots like intel_map_sg >>that have the deferred_attach_domain after iommu_need_mapping that >>possibly will suffer from the same problem. >> >>diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >>index b5c5ab58d395..063f45323cfc 100644 >>--- a/drivers/iommu/intel-iommu.c >>+++ b/drivers/iommu/intel-iommu.c >>@@ -3515,6 +3515,10 @@ static dma_addr_t __intel_map_single(struct >>device *dev, phys_addr_t paddr, >> if (!domain) >> return DMA_MAPPING_ERROR; >> >>+ if (!iommu_need_mapping(dev)) >>+ return paddr; >>+ >>+ domain = find_domain(dev); >> iommu = domain_get_iommu(domain); >> size = aligned_nrpages(paddr, size); >> >> >>I finally got a git repo over to one of these systems, and was >>able to reproduce the issue with the head of linus's tree. With commit >>9235cb13d7d1 ("iommu/vt-d: Allow devices with RMRRs to use identity >>domain") >>there are more of the warnings, because devices are using identity that >>weren't before. >> > >Is it possible to move deferred domain attachment to identity_mapping()? 
>
>diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>index 9dc37672bf89..234ab346198e 100644
>--- a/drivers/iommu/intel-iommu.c
>+++ b/drivers/iommu/intel-iommu.c
>@@ -2913,13 +2913,11 @@ static int __init si_domain_init(int hw)
>
> static int identity_mapping(struct device *dev)
> {
>-	struct device_domain_info *info;
>+	struct dmar_domain *domain;
>
>-	info = dev->archdata.iommu;
>-	if (info && info != DUMMY_DEVICE_DOMAIN_INFO && info != DEFER_DEVICE_DOMAIN_INFO)
>-		return (info->domain == si_domain);
>+	domain = deferred_attach_domain(dev);
>
>-	return 0;
>+	return (!domain || domain_type_is_si(domain));
> }
>
> static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)
>
>Best regards,
>baolu

Hi Baolu,

I think that would work, and then change the deferred_attach_domain calls
in __intel_map_single and intel_map_sg to find_domain? I did a quick test
with it on the system where I've been looking at this.

Regards,
Jerry

>_______________________________________________
>iommu mailing list
>iommu@lists.linux-foundation.org
>https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 12+ messages in thread
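The ordering problem described in this exchange can be sketched in userspace C. This is only a model under stated assumptions: all model_ and MODEL_ names are hypothetical, the "translation" is a made-up offset, and the error return stands in for the WARN_ON plus NULL return that the thread attributes to domain_get_iommu. The old path decides that mapping is needed while the device is still deferred and attaches afterwards; the new path attaches inside the identity check and then only looks the domain up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum model_type { MODEL_DMA, MODEL_IDENTITY };

struct model_domain {
	enum model_type type;
};

/* Hypothetical device state; "warnings" counts simulated WARN_ON hits. */
struct model_dev {
	bool deferred;
	struct model_domain *attached;
	struct model_domain *group_default;
	int warnings;
};

#define MODEL_MAPPING_ERROR UINT64_MAX
#define MODEL_IOVA_OFFSET   0x1000u /* invented, marks "translated" */

/* Attach the group's default domain on first use (deferred attach). */
struct model_domain *model_deferred_attach(struct model_dev *dev)
{
	if (dev->deferred) {
		dev->attached = dev->group_default;
		dev->deferred = false;
	}
	return dev->attached;
}

/* Pure lookup, no side effects: models the role find_domain would play. */
struct model_domain *model_find_domain(const struct model_dev *dev)
{
	return dev->deferred ? NULL : dev->attached;
}

/* Shared tail: only a DMA domain may be translated; anything else warns. */
uint64_t model_translate(struct model_dev *dev, struct model_domain *d,
			 uint64_t paddr)
{
	if (d == NULL)
		return MODEL_MAPPING_ERROR;
	if (d->type != MODEL_DMA) {
		dev->warnings++; /* stands in for the WARN_ON in the report */
		return MODEL_MAPPING_ERROR;
	}
	return paddr + MODEL_IOVA_OFFSET;
}

/* Old ordering: identity check sees a deferred device, attach comes later. */
uint64_t model_map_single_old(struct model_dev *dev, uint64_t paddr)
{
	bool identity = !dev->deferred && dev->attached != NULL &&
			dev->attached->type == MODEL_IDENTITY;

	if (identity)
		return paddr;
	return model_translate(dev, model_deferred_attach(dev), paddr);
}

/* New ordering: attach inside the identity check, then a plain lookup. */
uint64_t model_map_single_new(struct model_dev *dev, uint64_t paddr)
{
	struct model_domain *d = model_deferred_attach(dev);

	if (d == NULL || d->type == MODEL_IDENTITY)
		return paddr;
	return model_translate(dev, model_find_domain(dev), paddr);
}
```

With a deferred device whose group default is the identity domain, the old ordering reaches the type check with an identity domain and records a warning, while the new ordering returns the physical address unchanged. Whether the real driver behaves the same way after the change would still need to be confirmed on the affected hardware.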
* Re: warning from domain_get_iommu
  2020-02-08 10:19 ` Jerry Snitselaar
@ 2020-02-09  5:54 ` Lu Baolu
From: Lu Baolu @ 2020-02-09 5:54 UTC
To: iommu, linux-kernel

Hi,

On 2020/2/8 18:19, Jerry Snitselaar wrote:
> On Sat Feb 08 20, Lu Baolu wrote:
>> Hi Jerry,
>>
>> On 2020/2/7 17:34, Jerry Snitselaar wrote:
>>> On Thu Feb 06 20, Jerry Snitselaar wrote:
>>>> On Tue Feb 04 20, Jerry Snitselaar wrote:
>>>>> I'm working on getting a system to reproduce this, and verify it
>>>>> also occurs
>>>>> with 5.5, but I have a report of a case where the kdump kernel gives
>>>>> warnings like the following on a hp dl360 gen9:
>>>>>
>>>>> [ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI)
>>>>> Driver
>>>>> [ 2.832615] ehci-pci: EHCI PCI platform driver
>>>>> [ 2.834190] ehci-pci 0000:00:1a.0: EHCI Host Controller
>>>>> [ 2.835974] ehci-pci 0000:00:1a.0: new USB bus registered,
>>>>> assigned bus number 1
>>>>> [ 2.838276] ehci-pci 0000:00:1a.0: debug port 2
>>>>> [ 2.839700] WARNING: CPU: 0 PID: 1 at
>>>>> drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
>>>>> [ 2.840671] Modules linked in:
>>>>> [ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
>>>>> 4.18.0-170.el8.kdump2.x86_64 #1
>>>>> [ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360
>>>>> Gen9, BIOS P89 07/21/2019
>>>>> [ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
>>>>> [ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b
>>>>> 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01
>>>>> 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41
>>>>> 55 40 0f b6 f6
>>>>> [ 2.840671] RSP: 0018:ffffc900000dfab8 EFLAGS: 00010202
>>>>> [ 2.840671] RAX: ffff88ec7f1c8000 RBX: 0000006c7c867000 RCX:
>>>>> 0000000000000000
>>>>> [ 2.840671] RDX: 00000000fffffff0 RSI: 0000000000000000 RDI:
>>>>> ffff88ec7f1c8000
>>>>> [ 2.840671] RBP: ffff88ec6f7000b0 R08: ffff88ec7f19d000 R09:
>>>>> ffff88ec7cbfcd00
>>>>> [ 2.840671] R10: 0000000000000095 R11: ffffc900000df928 R12:
>>>>> 0000000000000000
>>>>> [ 2.840671] R13: ffff88ec7f1c8000 R14: 0000000000001000 R15:
>>>>> 00000000ffffffff
>>>>> [ 2.840671] FS: 0000000000000000(0000)
>>>>> GS:ffff88ec7f600000(0000) knlGS:0000000000000000
>>>>> [ 2.840671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 2.840671] CR2: 00007ff3e1713000 CR3: 0000006c7de0a004 CR4:
>>>>> 00000000001606b0
>>>>> [ 2.840671] Call Trace:
>>>>> [ 2.840671] __intel_map_single+0x62/0x140
>>>>> [ 2.840671] intel_alloc_coherent+0xa6/0x130
>>>>> [ 2.840671] dma_pool_alloc+0xd8/0x1e0
>>>>> [ 2.840671] e_qh_alloc+0x55/0x130
>>>>> [ 2.840671] ehci_setup+0x284/0x7b0
>>>>> [ 2.840671] ehci_pci_setup+0xa3/0x530
>>>>> [ 2.840671] usb_add_hcd+0x2b6/0x800
>>>>> [ 2.840671] usb_hcd_pci_probe+0x375/0x460
>>>>> [ 2.840671] local_pci_probe+0x41/0x90
>>>>> [ 2.840671] pci_device_probe+0x105/0x1b0
>>>>> [ 2.840671] driver_probe_device+0x12d/0x460
>>>>> [ 2.840671] device_driver_attach+0x50/0x60
>>>>> [ 2.840671] __driver_attach+0x61/0x130
>>>>> [ 2.840671] ? device_driver_attach+0x60/0x60
>>>>> [ 2.840671] bus_for_each_dev+0x77/0xc0
>>>>> [ 2.840671] ? klist_add_tail+0x3b/0x70
>>>>> [ 2.840671] bus_add_driver+0x14d/0x1e0
>>>>> [ 2.840671] ? ehci_hcd_init+0xaa/0xaa
>>>>> [ 2.840671] ? do_early_param+0x91/0x91
>>>>> [ 2.840671] driver_register+0x6b/0xb0
>>>>> [ 2.840671] ? ehci_hcd_init+0xaa/0xaa
>>>>> [ 2.840671] do_one_initcall+0x46/0x1c3
>>>>> [ 2.840671] ? do_early_param+0x91/0x91
>>>>> [ 2.840671] kernel_init_freeable+0x1af/0x258
>>>>> [ 2.840671] ? rest_init+0xaa/0xaa
>>>>> [ 2.840671] kernel_init+0xa/0xf9
>>>>> [ 2.840671] ret_from_fork+0x35/0x40
>>>>> [ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
>>>>> [ 3.010848] ehci-pci 0000:00:1a.0: Using iommu dma mapping
>>>>> [ 3.012551] ehci-pci 0000:00:1a.0: 32bit DMA uses non-identity
>>>>> mapping
>>>>> [ 3.018537] ehci-pci 0000:00:1a.0: cache line size of 64 is not
>>>>> supported
>>>>> [ 3.021188] ehci-pci 0000:00:1a.0: irq 18, io mem 0x93002000
>>>>> [ 3.029006] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
>>>>> [ 3.030918] usb usb1: New USB device found, idVendor=1d6b,
>>>>> idProduct=0002, bcdDevice= 4.18
>>>>> [ 3.033491] usb usb1: New USB device strings: Mfr=3, Product=2,
>>>>> SerialNumber=1
>>>>> [ 3.035900] usb usb1: Product: EHCI Host Controller
>>>>> [ 3.037423] usb usb1: Manufacturer: Linux
>>>>> 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
>>>>> [ 3.039691] usb usb1: SerialNumber: 0000:00:1a.0
>>>>>
>>>>> It looks like the device finishes initializing once it figures out it
>>>>> needs dma mapping instead of the default
>>>>> passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
>>>>> calls __intel_map_single, so I'm not sure why it is tripping over the
>>>>> WARN_ON in domain_get_iommu.
>>>>>
>>>>> one thing I noticed while looking at this is that domain_get_iommu can
>>>>> return NULL. So should there be something like the following in
>>>>> __intel_map_single after the domain_get_iommu call?
>>>>>
>>>>> 	if (!iommu)
>>>>> 		goto error;
>>>>>
>>>>> It is possible to deref the null pointer later otherwise.
>>>>>
>>>>> Regards,
>>>>> Jerry
>>>>
>>>> I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.
>>>
>>> Hi Baolu,
>>>
>>> I think I understand what is happening here. With the kdump boot
>>> translation is pre-enabled, so in intel_iommu_add_device things are
>>> getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
>>> calls iommu_need_mapping it returns true, but doesn't do the dma
>>> domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then
>>> __intel_map_single gets called and it calls deferred_attach_domain,
>>> which sets the domain to the group domain, which in this case is the
>>> identity domain. Then it calls domain_get_iommu, which spits out the
>>> warning because the domain type was dma and returns null. My
>>> workaround was to add a call to iommu_need_mapping and find_domain
>>> after the deferred_attach_domain, but I don't know if that is the
>>> correct solution. There are a couple other spots like intel_map_sg
>>> that have the deferred_attach_domain after iommu_need_mapping that
>>> possibly will suffer from the same problem.
>>>
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index b5c5ab58d395..063f45323cfc 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -3515,6 +3515,10 @@ static dma_addr_t __intel_map_single(struct
>>> device *dev, phys_addr_t paddr,
>>> 	if (!domain)
>>> 		return DMA_MAPPING_ERROR;
>>>
>>> +	if (!iommu_need_mapping(dev))
>>> +		return paddr;
>>> +
>>> +	domain = find_domain(dev);
>>> 	iommu = domain_get_iommu(domain);
>>> 	size = aligned_nrpages(paddr, size);
>>>
>>>
>>> I finally got a git repo over to one of these systems, and was
>>> able to reproduce the issue with the head of linus's tree. With commit
>>> 9235cb13d7d1 ("iommu/vt-d: Allow devices with RMRRs to use identity
>>> domain")
>>> there are more of the warnings, because devices are using identity that
>>> weren't before.
>>>
>>
>> Is it possible to move deferred domain attachment to identity_mapping()?
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 9dc37672bf89..234ab346198e 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -2913,13 +2913,11 @@ static int __init si_domain_init(int hw)
>>
>> static int identity_mapping(struct device *dev)
>> {
>> -	struct device_domain_info *info;
>> +	struct dmar_domain *domain;
>>
>> -	info = dev->archdata.iommu;
>> -	if (info && info != DUMMY_DEVICE_DOMAIN_INFO && info !=
>> DEFER_DEVICE_DOMAIN_INFO)
>> -		return (info->domain == si_domain);
>> +	domain = deferred_attach_domain(dev);
>>
>> -	return 0;
>> +	return (!domain || domain_type_is_si(domain));
>> }
>>
>> static int domain_add_dev_info(struct dmar_domain *domain, struct
>> device *dev)
>>
>> Best regards,
>> baolu
>
> Hi Baolu,
>
> I think that would work, and then change the deferred_attach_domain
> calls in __intel_map_single and intel_map_sg to find_domain?
>

Yes.

> I did a quick test with it on the system where I've been looking at this.
>

Thanks!

Best regards,
baolu
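For readers following the ordering problem discussed above, here is a minimal user-space sketch in C. It is not the kernel code: every type and function name below (struct device, deferred_attach, map_single_old, map_single_new, and so on) is a hypothetical stand-in, and it assumes that the check in domain_get_iommu warns whenever the domain it is handed is not a DMA domain. The sketch only models the order of the identity check and the deferred attach, before and after the change proposed for identity_mapping().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum domain_type { DOMAIN_IDENTITY, DOMAIN_DMA };

struct domain {
	enum domain_type type;
};

struct device {
	struct domain *attached;      /* NULL while the attach is deferred */
	struct domain *group_default; /* domain a deferred attach will pick */
};

struct domain si_domain = { DOMAIN_IDENTITY };

/* Model of a deferred attach: complete it if pending, return the domain. */
struct domain *deferred_attach(struct device *dev)
{
	if (!dev->attached)
		dev->attached = dev->group_default;
	return dev->attached;
}

/* Old check: a device whose attach is still deferred is "not identity". */
int identity_mapping_old(const struct device *dev)
{
	return dev->attached != NULL && dev->attached == &si_domain;
}

/* Proposed check: resolve the deferred attach first, then look at the type. */
int identity_mapping_new(struct device *dev)
{
	struct domain *d = deferred_attach(dev);

	return !d || d->type == DOMAIN_IDENTITY;
}

/* Model of the lookup: 0 on success, -1 where the kernel would WARN
 * (assumption: any non-DMA domain trips the check). */
int get_iommu(const struct domain *d)
{
	return d->type == DOMAIN_DMA ? 0 : -1;
}

/* Old map path: identity check, then deferred attach, then lookup.
 * Returns 1 if the modelled warning would fire, 0 otherwise. */
int map_single_old(struct device *dev)
{
	if (identity_mapping_old(dev))
		return 0; /* direct mapping, lookup never reached */
	return get_iommu(deferred_attach(dev)) < 0;
}

/* New map path: the check itself attaches, so the lookup only ever
 * sees a domain that already passed the identity test. */
int map_single_new(struct device *dev)
{
	if (identity_mapping_new(dev))
		return 0;
	assert(dev->attached != NULL); /* guaranteed by identity_mapping_new */
	return get_iommu(dev->attached) < 0;
}
```

With a device whose attach is deferred and whose group default is the identity domain, the old path reports "needs mapping" and only afterwards attaches the identity domain, so the modelled warning fires. The new path attaches inside the check and takes the direct-mapping branch instead. The real fix would also have to cover intel_map_sg and the other callers mentioned in the thread, which this sketch does not model.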
end of thread, other threads:[~2020-02-09 5:54 UTC | newest]

Thread overview: 12+ messages
2020-02-04 20:07 warning from domain_get_iommu Jerry Snitselaar
2020-02-06 17:43 ` Jerry Snitselaar
2020-02-07  9:34   ` Jerry Snitselaar
2020-02-08  6:53     ` Lu Baolu
2020-02-08 10:19       ` Jerry Snitselaar
2020-02-09  5:54         ` Lu Baolu