From: Jerry Snitselaar <jsnitsel@redhat.com> To: iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu <baolu.lu@linux.intel.com> Subject: Re: warning from domain_get_iommu Date: Fri, 7 Feb 2020 02:34:13 -0700 [thread overview] Message-ID: <20200207093413.oy4tclbrb3vqs3vz@cantor> (raw) In-Reply-To: <20200206174358.shzhieijle5wdshr@cantor> On Thu Feb 06 20, Jerry Snitselaar wrote: >On Tue Feb 04 20, Jerry Snitselaar wrote: >>I'm working on getting a system to reproduce this, and verify it also occurs >>with 5.5, but I have a report of a case where the kdump kernel gives >>warnings like the following on a hp dl360 gen9: >> >>[ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver >>[ 2.832615] ehci-pci: EHCI PCI platform driver >>[ 2.834190] ehci-pci 0000:00:1a.0: EHCI Host Controller >>[ 2.835974] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1 >>[ 2.838276] ehci-pci 0000:00:1a.0: debug port 2 >>[ 2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60 >>[ 2.840671] Modules linked in: >>[ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1 >>[ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019 >>[ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60 >>[ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6 >>[ 2.840671] RSP: 0018:ffffc900000dfab8 EFLAGS: 00010202 >>[ 2.840671] RAX: ffff88ec7f1c8000 RBX: 0000006c7c867000 RCX: 0000000000000000 >>[ 2.840671] RDX: 00000000fffffff0 RSI: 0000000000000000 RDI: ffff88ec7f1c8000 >>[ 2.840671] RBP: ffff88ec6f7000b0 R08: ffff88ec7f19d000 R09: ffff88ec7cbfcd00 >>[ 2.840671] R10: 0000000000000095 R11: ffffc900000df928 R12: 0000000000000000 >>[ 2.840671] R13: ffff88ec7f1c8000 R14: 0000000000001000 R15: 00000000ffffffff >>[ 2.840671] FS: 0000000000000000(0000) GS:ffff88ec7f600000(0000) knlGS:0000000000000000 >>[ 2.840671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>[ 2.840671] CR2: 00007ff3e1713000 CR3: 0000006c7de0a004 CR4: 00000000001606b0 >>[ 2.840671] Call Trace: >>[ 2.840671] __intel_map_single+0x62/0x140 >>[ 2.840671] intel_alloc_coherent+0xa6/0x130 >>[ 2.840671] dma_pool_alloc+0xd8/0x1e0 >>[ 2.840671] e_qh_alloc+0x55/0x130 >>[ 2.840671] ehci_setup+0x284/0x7b0 >>[ 2.840671] ehci_pci_setup+0xa3/0x530 >>[ 2.840671] usb_add_hcd+0x2b6/0x800 >>[ 2.840671] usb_hcd_pci_probe+0x375/0x460 >>[ 2.840671] local_pci_probe+0x41/0x90 >>[ 2.840671] pci_device_probe+0x105/0x1b0 >>[ 2.840671] driver_probe_device+0x12d/0x460 >>[ 2.840671] device_driver_attach+0x50/0x60 >>[ 2.840671] __driver_attach+0x61/0x130 >>[ 2.840671] ? device_driver_attach+0x60/0x60 >>[ 2.840671] bus_for_each_dev+0x77/0xc0 >>[ 2.840671] ? klist_add_tail+0x3b/0x70 >>[ 2.840671] bus_add_driver+0x14d/0x1e0 >>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa >>[ 2.840671] ? do_early_param+0x91/0x91 >>[ 2.840671] driver_register+0x6b/0xb0 >>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa >>[ 2.840671] do_one_initcall+0x46/0x1c3 >>[ 2.840671] ? do_early_param+0x91/0x91 >>[ 2.840671] kernel_init_freeable+0x1af/0x258 >>[ 2.840671] ? rest_init+0xaa/0xaa >>[ 2.840671] kernel_init+0xa/0xf9 >>[ 2.840671] ret_from_fork+0x35/0x40 >>[ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]--- >>[ 3.010848] ehci-pci 0000:00:1a.0: Using iommu dma mapping >>[ 3.012551] ehci-pci 0000:00:1a.0: 32bit DMA uses non-identity mapping >>[ 3.018537] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported >>[ 3.021188] ehci-pci 0000:00:1a.0: irq 18, io mem 0x93002000 >>[ 3.029006] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00 >>[ 3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18 >>[ 3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 >>[ 3.035900] usb usb1: Product: EHCI Host Controller >>[ 3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd >>[ 3.039691] usb usb1: SerialNumber: 0000:00:1a.0 >> >>It looks like the device finishes initializing once it figures out it >>needs dma mapping instead of the default >>passthrough. intel_alloc_coherent calls iommu_need_mapping, before it >>calls __intel_map_single, so I'm not sure why it is tripping over the >>WARN_ON in domain_get_iommu. >> >>one thing I noticed while looking at this is that domain_get_iommu can >>return NULL. So should there be something like the following in >>__intel_map_single after the domain_get_iommu call? >> >>if (!iommu) >> goto error; >> >>It is possible to deref the null pointer later otherwise. >> >>Regards, >>Jerry > >I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE. Hi Baolu, I think I understand what is happening here. With the kdump boot translation is pre-enabled, so in intel_iommu_add_device things are getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent calls iommu_need_mapping it returns true, but doesn't do the dma domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then __intel_map_single gets called and it calls deferred_attach_domain, which sets the domain to the group domain, which in this case is the identity domain. Then it calls domain_get_iommu, which spits out the warning because the domain type was dma and returns null. My workaround was to add a call to iommu_need_mapping and find_domain after the deferred_attach_domain, but I don't know if that is the correct solution. There are a couple other spots like intel_map_sg that have the deferred_attach_domain after iommu_need_mapping that possibly will suffer from the same problem. diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index b5c5ab58d395..063f45323cfc 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3515,6 +3515,10 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr, if (!domain) return DMA_MAPPING_ERROR; + if (!iommu_need_mapping(dev)) + return paddr; + + domain = find_domain(dev); iommu = domain_get_iommu(domain); size = aligned_nrpages(paddr, size); I finally got a git repo over to one of these systems, and was able to reproduce the issue with the head of linus's tree. With commit 9235cb13d7d1 ("iommu/vt-d: Allow devices with RMRRs to use identity domain") there are more of the warnings, because devices are using identity that weren't before. Regards, Jerry
WARNING: multiple messages have this Message-ID (diff)
From: Jerry Snitselaar <jsnitsel@redhat.com> To: iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu <baolu.lu@linux.intel.com> Subject: Re: warning from domain_get_iommu Date: Fri, 7 Feb 2020 02:34:13 -0700 [thread overview] Message-ID: <20200207093413.oy4tclbrb3vqs3vz@cantor> (raw) In-Reply-To: <20200206174358.shzhieijle5wdshr@cantor> On Thu Feb 06 20, Jerry Snitselaar wrote: >On Tue Feb 04 20, Jerry Snitselaar wrote: >>I'm working on getting a system to reproduce this, and verify it also occurs >>with 5.5, but I have a report of a case where the kdump kernel gives >>warnings like the following on a hp dl360 gen9: >> >>[ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver >>[ 2.832615] ehci-pci: EHCI PCI platform driver >>[ 2.834190] ehci-pci 0000:00:1a.0: EHCI Host Controller >>[ 2.835974] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1 >>[ 2.838276] ehci-pci 0000:00:1a.0: debug port 2 >>[ 2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60 >>[ 2.840671] Modules linked in: >>[ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1 >>[ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019 >>[ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60 >>[ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6 >>[ 2.840671] RSP: 0018:ffffc900000dfab8 EFLAGS: 00010202 >>[ 2.840671] RAX: ffff88ec7f1c8000 RBX: 0000006c7c867000 RCX: 0000000000000000 >>[ 2.840671] RDX: 00000000fffffff0 RSI: 0000000000000000 RDI: ffff88ec7f1c8000 >>[ 2.840671] RBP: ffff88ec6f7000b0 R08: ffff88ec7f19d000 R09: ffff88ec7cbfcd00 >>[ 2.840671] R10: 0000000000000095 R11: ffffc900000df928 R12: 0000000000000000 >>[ 2.840671] R13: ffff88ec7f1c8000 R14: 0000000000001000 R15: 00000000ffffffff >>[ 2.840671] FS: 0000000000000000(0000) GS:ffff88ec7f600000(0000) knlGS:0000000000000000 >>[ 2.840671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>[ 2.840671] CR2: 00007ff3e1713000 CR3: 0000006c7de0a004 CR4: 00000000001606b0 >>[ 2.840671] Call Trace: >>[ 2.840671] __intel_map_single+0x62/0x140 >>[ 2.840671] intel_alloc_coherent+0xa6/0x130 >>[ 2.840671] dma_pool_alloc+0xd8/0x1e0 >>[ 2.840671] e_qh_alloc+0x55/0x130 >>[ 2.840671] ehci_setup+0x284/0x7b0 >>[ 2.840671] ehci_pci_setup+0xa3/0x530 >>[ 2.840671] usb_add_hcd+0x2b6/0x800 >>[ 2.840671] usb_hcd_pci_probe+0x375/0x460 >>[ 2.840671] local_pci_probe+0x41/0x90 >>[ 2.840671] pci_device_probe+0x105/0x1b0 >>[ 2.840671] driver_probe_device+0x12d/0x460 >>[ 2.840671] device_driver_attach+0x50/0x60 >>[ 2.840671] __driver_attach+0x61/0x130 >>[ 2.840671] ? device_driver_attach+0x60/0x60 >>[ 2.840671] bus_for_each_dev+0x77/0xc0 >>[ 2.840671] ? klist_add_tail+0x3b/0x70 >>[ 2.840671] bus_add_driver+0x14d/0x1e0 >>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa >>[ 2.840671] ? do_early_param+0x91/0x91 >>[ 2.840671] driver_register+0x6b/0xb0 >>[ 2.840671] ? ehci_hcd_init+0xaa/0xaa >>[ 2.840671] do_one_initcall+0x46/0x1c3 >>[ 2.840671] ? do_early_param+0x91/0x91 >>[ 2.840671] kernel_init_freeable+0x1af/0x258 >>[ 2.840671] ? rest_init+0xaa/0xaa >>[ 2.840671] kernel_init+0xa/0xf9 >>[ 2.840671] ret_from_fork+0x35/0x40 >>[ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]--- >>[ 3.010848] ehci-pci 0000:00:1a.0: Using iommu dma mapping >>[ 3.012551] ehci-pci 0000:00:1a.0: 32bit DMA uses non-identity mapping >>[ 3.018537] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported >>[ 3.021188] ehci-pci 0000:00:1a.0: irq 18, io mem 0x93002000 >>[ 3.029006] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00 >>[ 3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18 >>[ 3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 >>[ 3.035900] usb usb1: Product: EHCI Host Controller >>[ 3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd >>[ 3.039691] usb usb1: SerialNumber: 0000:00:1a.0 >> >>It looks like the device finishes initializing once it figures out it >>needs dma mapping instead of the default >>passthrough. intel_alloc_coherent calls iommu_need_mapping, before it >>calls __intel_map_single, so I'm not sure why it is tripping over the >>WARN_ON in domain_get_iommu. >> >>one thing I noticed while looking at this is that domain_get_iommu can >>return NULL. So should there be something like the following in >>__intel_map_single after the domain_get_iommu call? >> >>if (!iommu) >> goto error; >> >>It is possible to deref the null pointer later otherwise. >> >>Regards, >>Jerry > >I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE. Hi Baolu, I think I understand what is happening here. With the kdump boot translation is pre-enabled, so in intel_iommu_add_device things are getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent calls iommu_need_mapping it returns true, but doesn't do the dma domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then __intel_map_single gets called and it calls deferred_attach_domain, which sets the domain to the group domain, which in this case is the identity domain. Then it calls domain_get_iommu, which spits out the warning because the domain type was dma and returns null. My workaround was to add a call to iommu_need_mapping and find_domain after the deferred_attach_domain, but I don't know if that is the correct solution. There are a couple other spots like intel_map_sg that have the deferred_attach_domain after iommu_need_mapping that possibly will suffer from the same problem. diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index b5c5ab58d395..063f45323cfc 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3515,6 +3515,10 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr, if (!domain) return DMA_MAPPING_ERROR; + if (!iommu_need_mapping(dev)) + return paddr; + + domain = find_domain(dev); iommu = domain_get_iommu(domain); size = aligned_nrpages(paddr, size); I finally got a git repo over to one of these systems, and was able to reproduce the issue with the head of linus's tree. With commit 9235cb13d7d1 ("iommu/vt-d: Allow devices with RMRRs to use identity domain") there are more of the warnings, because devices are using identity that weren't before. Regards, Jerry _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
next prev parent reply other threads:[~2020-02-07 9:34 UTC|newest] Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-02-04 20:07 warning from domain_get_iommu Jerry Snitselaar 2020-02-04 20:07 ` Jerry Snitselaar 2020-02-06 17:43 ` Jerry Snitselaar 2020-02-06 17:43 ` Jerry Snitselaar 2020-02-07 9:34 ` Jerry Snitselaar [this message] 2020-02-07 9:34 ` Jerry Snitselaar 2020-02-08 6:53 ` Lu Baolu 2020-02-08 6:53 ` Lu Baolu 2020-02-08 10:19 ` Jerry Snitselaar 2020-02-08 10:19 ` Jerry Snitselaar 2020-02-09 5:54 ` Lu Baolu 2020-02-09 5:54 ` Lu Baolu
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200207093413.oy4tclbrb3vqs3vz@cantor \ --to=jsnitsel@redhat.com \ --cc=baolu.lu@linux.intel.com \ --cc=iommu@lists.linux-foundation.org \ --cc=linux-kernel@vger.kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.