* [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

From: Daniel Drake @ 2020-02-19  3:21 UTC (permalink / raw)
To: dwmw2, baolu.lu, joro; +Cc: bhelgaas, iommu, linux, jonathan.derrick

From: Jon Derrick <jonathan.derrick@intel.com>

The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint.

The real DMA device is now used for the DMA mapping, but one case was
missed earlier: if the VMD device (and hence its subdevices too) is
under IOMMU_DOMAIN_IDENTITY, mappings do not work.

Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
creating an iommu DMA mapping, and fall back on dma_direct_map_page()
for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
case is broken when intel_map_page() handles a subdevice.

We observe that at iommu attach time, dmar_insert_one_dev_info() for
the subdevices will never set dev->archdata.iommu. This is because
that function uses find_domain() to check whether there is already an
IOMMU domain for the device, and find_domain() then defers to the real
DMA device, which does have one. Thus dmar_insert_one_dev_info()
returns without assigning dev->archdata.iommu.

Then, later:

1. intel_map_page() checks whether an IOMMU mapping is needed by
   calling iommu_need_mapping() on the subdevice. identity_mapping()
   returns false because dev->archdata.iommu is NULL, so
   iommu_need_mapping() returns true, indicating that a mapping is
   needed.
2. __intel_map_single() is called to create the mapping.
3. __intel_map_single() calls find_domain(). This function now returns
   the IDENTITY domain corresponding to the real DMA device.
4. __intel_map_single() calls domain_get_iommu() on this "real"
   domain. A failure is hit and the entire operation is aborted,
   because this codepath is not intended to handle IDENTITY mappings:

	if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
		return NULL;

Fix this by using the real DMA device when checking whether a mapping
is needed. The IDENTITY case will then directly fall back on
dma_direct_map_page(). The subdevice DMA mask is still considered in
order to handle situations where, e.g., the subdevice only supports
32-bit DMA while the real DMA requester has a 64-bit DMA mask.

Reported-by: Daniel Drake <drake@endlessm.com>
Fixes: b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
---

Notes:
    v2: switch to Jon's approach instead
    v3: shortcut mask check in non-identity case
    v4: amend commit msg to explain why subdevice DMA mask is still
        considered

    This problem was originally detected with a non-upstream patch
    which creates PCI devices similar to VMD: "PCI: Add Intel remapped
    NVMe device support"
    (https://marc.info/?l=linux-ide&m=156015271021615&w=2)

    This patch has now been tested on VMD forced into identity mode.

 drivers/iommu/intel-iommu.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..7ffd252bf835 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3582,12 +3582,16 @@ static struct dmar_domain *get_private_domain_for_dev(struct device *dev)
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static bool iommu_need_mapping(struct device *dev)
 {
+	struct device *dma_dev = dev;
 	int ret;
 
 	if (iommu_dummy(dev))
 		return false;
 
-	ret = identity_mapping(dev);
+	if (dev_is_pci(dev))
+		dma_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	ret = identity_mapping(dma_dev);
 	if (ret) {
 		u64 dma_mask = *dev->dma_mask;
 
@@ -3601,19 +3605,19 @@ static bool iommu_need_mapping(struct device *dev)
 		 * 32 bit DMA is removed from si_domain and fall back to
 		 * non-identity mapping.
 		 */
-		dmar_remove_one_dev_info(dev);
-		ret = iommu_request_dma_domain_for_dev(dev);
+		dmar_remove_one_dev_info(dma_dev);
+		ret = iommu_request_dma_domain_for_dev(dma_dev);
 		if (ret) {
 			struct iommu_domain *domain;
 			struct dmar_domain *dmar_domain;
 
-			domain = iommu_get_domain_for_dev(dev);
+			domain = iommu_get_domain_for_dev(dma_dev);
 			if (domain) {
 				dmar_domain = to_dmar_domain(domain);
 				dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
 			}
-			dmar_remove_one_dev_info(dev);
-			get_private_domain_for_dev(dev);
+			dmar_remove_one_dev_info(dma_dev);
+			get_private_domain_for_dev(dma_dev);
 		}
 
 		dev_info(dev, "32bit DMA uses non-identity mapping\n");
-- 
2.20.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
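The decision the patch changes can be sketched as a tiny user-space model. This is illustrative only, not kernel code: `struct model_dev`, the `MODEL_DOMAIN_*` enum, and the `need_mapping_*` functions are invented here to show why the subdevice must defer to its real DMA requester.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a VMD subdevice never gets per-device IOMMU state of its
 * own (archdata.iommu stays NULL); only the real DMA device (the VMD
 * endpoint) was attached to a domain. */
enum model_domain { MODEL_DOMAIN_NONE, MODEL_DOMAIN_IDENTITY, MODEL_DOMAIN_DMA };

struct model_dev {
	enum model_domain domain;	/* what attach recorded for this dev */
	struct model_dev *real_dma_dev;	/* VMD endpoint for subdevices, else NULL */
};

/* Before the fix: only the subdevice's own (unset) state was consulted,
 * so MODEL_DOMAIN_NONE looked like "not identity" and an IOMMU mapping
 * was wrongly attempted. */
static bool need_mapping_old(const struct model_dev *dev)
{
	return dev->domain != MODEL_DOMAIN_IDENTITY;
}

/* After the fix: defer to the real DMA requester's domain. */
static bool need_mapping_fixed(const struct model_dev *dev)
{
	const struct model_dev *dma_dev =
		dev->real_dma_dev ? dev->real_dma_dev : dev;
	return dma_dev->domain != MODEL_DOMAIN_IDENTITY;
}
```

With the endpoint in the identity domain, the old check sends the subdevice down the DMA-domain codepath (hitting the WARN above), while the fixed check correctly falls back on direct DMA.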
* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

From: Lu Baolu @ 2020-02-19  3:40 UTC (permalink / raw)
To: Daniel Drake, dwmw2, joro; +Cc: bhelgaas, iommu, linux, jonathan.derrick

Hi,

On 2020/2/19 11:21, Daniel Drake wrote:
> From: Jon Derrick <jonathan.derrick@intel.com>
>
> [...]
>
> Fix this by using the real DMA device when checking if a mapping is
> needed. The IDENTITY case will then directly fall back on
> dma_direct_map_page(). The subdevice DMA mask is still considered in
> order to handle any situations where (e.g.) the subdevice only supports
> 32-bit DMA with the real DMA requester having a 64-bit DMA mask.

With respect, this is problematical. The parent and all subdevices share
a single translation entry. The DMA mask should be consistent.

Otherwise, for example, subdevice A has 64-bit DMA capability and uses
an identity domain for DMA translation. While subdevice B has 32-bit DMA
capability and is forced to switch to DMA domain. Subdevice A will be
impacted without any notification.

Best regards,
baolu
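Baolu's objection can be illustrated with another toy model (all names invented; this is not the intel-iommu API): because the parent and all subdevices share one translation entry, re-attaching the shared domain for a 32-bit subdevice silently changes the path its 64-bit siblings use.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: one shared domain stands in for the single translation
 * entry covering the VMD endpoint and every subdevice behind it. */
enum shared_type { SHARED_IDENTITY, SHARED_DMA };

struct shared_domain {
	enum shared_type type;	/* one entry for parent + all subdevices */
};

/* A subdevice whose DMA mask cannot cover the identity map forces the
 * shared entry over to the DMA domain - for everyone. */
static void model_attach(struct shared_domain *d, uint64_t dma_mask)
{
	if (d->type == SHARED_IDENTITY && dma_mask <= UINT32_MAX)
		d->type = SHARED_DMA;
}
```

Attaching a 64-bit subdevice leaves the shared entry alone; attaching a 32-bit one flips it, and the 64-bit sibling is moved off identity without any notification — exactly the hazard described above.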
* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

From: Daniel Drake @ 2020-02-20  3:36 UTC (permalink / raw)
To: Lu Baolu
Cc: David Woodhouse, iommu, Bjorn Helgaas, Linux Upstreaming Team, Jon Derrick

On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
> With respect, this is problematical. The parent and all subdevices share
> a single translation entry. The DMA mask should be consistent.
>
> Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> an identity domain for DMA translation. While subdevice B has 32-bit DMA
> capability and is forced to switch to DMA domain. Subdevice A will be
> impacted without any notification.

I see what you mean. Perhaps we should just ensure that setups
involving such real DMA devices and subdevices always use the DMA
domain, avoiding this type of complication. That's apparently even the
default for VMD.

This is probably something that should be forced/ensured when the real
DMA device gets registered, because, similarly to the noted case, we
can't risk any identity mappings having been created on the real
device if we later decide to move it into the DMA domain based on the
appearance of subdevices.

Jon, any thoughts?

Daniel
* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

From: Daniel Drake @ 2020-02-20 10:06 UTC (permalink / raw)
To: baolu.lu; +Cc: bhelgaas, linux, iommu, dwmw2, jonathan.derrick

> On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
> > With respect, this is problematical. The parent and all subdevices share
> > a single translation entry. The DMA mask should be consistent.
> >
> > Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> > an identity domain for DMA translation. While subdevice B has 32-bit DMA
> > capability and is forced to switch to DMA domain. Subdevice A will be
> > impacted without any notification.

Looking closer, this problematic codepath may already be happening for
VMD, under intel_iommu_add_device(). Consider this function running
for a VMD subdevice; we hit:

	domain = iommu_get_domain_for_dev(dev);

I can't quite grasp the code flow here, but domain->type now always
seems to return the domain type of the real DMA device, which seems
like pretty reasonable behaviour.

	if (domain->type == IOMMU_DOMAIN_DMA) {

and as detailed in previous mails, the real VMD device seems to be in
a DMA domain by default, so we continue.

		if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {

Now we check the default domain type of the subdevice. This seems
rather likely to return IDENTITY because that's effectively the
default type...

			ret = iommu_request_dm_for_dev(dev);
			if (ret) {
				dmar_remove_one_dev_info(dev);
				dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
				domain_add_dev_info(si_domain, dev);
				dev_info(dev,
					 "Device uses a private identity domain.\n");
			}
		}

and now we're doing the bad stuff that Lu pointed out: we only have
one mapping shared for all the subdevices, so if we end up changing it
for one subdevice, we're likely to break another.

In this case iommu_request_dm_for_dev() returns -EBUSY without doing
anything, and the following private-identity code fortunately seems to
have no consequential effects: the real DMA device continues to
operate in the DMA domain, and all subdevice DMA requests go through
the DMA mapping codepath. That's probably why VMD appears to be
working fine anyway, but this seems fragile.

The following changes enforce that the real DMA device is in the DMA
domain, and avoid the intel_iommu_add_device() codepaths that would
try to change it to a different domain type. Let me know if I'm on the
right lines...

---
 drivers/iommu/intel-iommu.c               | 16 ++++++++++++++++
 drivers/pci/controller/intel-nvme-remap.c |  6 ++++++
 2 files changed, 22 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9644a5b3e0ae..8872b8d1780d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2911,6 +2911,9 @@ static int device_def_domain_type(struct device *dev)
 	if (dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(dev);
 
+		if (pci_real_dma_dev(pdev) != pdev)
+			return IOMMU_DOMAIN_DMA;
+
 		if (device_is_rmrr_locked(dev))
 			return IOMMU_DOMAIN_DMA;
 
@@ -5580,6 +5583,7 @@ static bool intel_iommu_capable(enum iommu_cap cap)
 
 static int intel_iommu_add_device(struct device *dev)
 {
+	struct device *real_dev = dev;
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
 	struct intel_iommu *iommu;
@@ -5591,6 +5595,17 @@ static int intel_iommu_add_device(struct device *dev)
 	if (!iommu)
 		return -ENODEV;
 
+	if (dev_is_pci(dev))
+		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	if (real_dev != dev) {
+		domain = iommu_get_domain_for_dev(real_dev);
+		if (domain->type != IOMMU_DOMAIN_DMA) {
+			dev_err(dev, "Real DMA device not in DMA domain; can't handle DMA\n");
+			return -ENODEV;
+		}
+	}
+
 	iommu_device_link(&iommu->iommu, dev);
 
 	if (translation_pre_enabled(iommu))
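The guard Daniel proposes can be modeled in a few lines of plain C. All names here (`model_add_device`, the `MODEL_*` enum) are invented for illustration; this is not the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the proposed intel_iommu_add_device() guard: a
 * subdevice can only be added if its real DMA device already sits in
 * the DMA domain; otherwise registration is refused. */
#define MODEL_ENODEV 19	/* mirrors errno's ENODEV value on Linux */

enum model_domain { MODEL_IDENTITY, MODEL_DMA };

struct model_dev {
	enum model_domain domain;
	struct model_dev *real_dma_dev;	/* NULL when the device is its own requester */
};

static int model_add_device(const struct model_dev *dev)
{
	const struct model_dev *real = dev->real_dma_dev;

	if (real && real->domain != MODEL_DMA)
		return -MODEL_ENODEV;	/* real DMA device not in DMA domain */
	return 0;
}
```

A subdevice behind an endpoint in the DMA domain is accepted; one behind an identity-mapped endpoint is rejected up front instead of failing later in domain_get_iommu().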
* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

From: Lu Baolu @ 2020-02-20 11:58 UTC (permalink / raw)
To: Daniel Drake; +Cc: bhelgaas, linux, iommu, dwmw2, jonathan.derrick

Hi,

On 2020/2/20 18:06, Daniel Drake wrote:
> [...]
>
> and now we're doing the bad stuff that Lu pointed out: we only have one
> mapping shared for all the subdevices, so if we end up changing it for one
> subdevice, we're likely to be breaking another.

Yes. Agreed.

By the way, do all subdevices stay in a same iommu group?

Best regards,
baolu
* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

From: Derrick, Jonathan @ 2020-02-27 18:19 UTC (permalink / raw)
To: baolu.lu, drake; +Cc: bhelgaas, dwmw2, iommu, linux

Hi Baolu, Daniel,

Sorry for the delay. Was offline for the last week.

On Thu, 2020-02-20 at 19:58 +0800, Lu Baolu wrote:
> [...]
>
> By the way, do all subdevices stay in a same iommu group?

The VMD endpoint and all subdevices in the VMD domain are in the same
IOMMU group. The real DMA device for VMD (the VMD endpoint) only
represents the DMA requester as determined by the PCIe source-id. The
VMD endpoint itself doesn't have a DMA engine, so only the subdevices
matter for mapping.
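Why everything lands in one group can be sketched as follows: the IOMMU only ever observes the VMD endpoint's PCIe source-id, so grouping by requester ID collapses the endpoint and all subdevices together. This is an illustrative model; `model_source_id`, the packed BDF encoding, and the example BDF values are all invented.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Toy model: DMA from devices behind VMD is tagged with the VMD
 * endpoint's source-id, so requester-ID-based group assignment puts
 * the endpoint and every subdevice into one IOMMU group. */
struct model_pci_dev {
	uint16_t bdf;			/* packed bus/device/function */
	struct model_pci_dev *real_dma_dev;	/* VMD endpoint, or NULL */
};

/* The requester ID the IOMMU actually sees on the bus. */
static uint16_t model_source_id(const struct model_pci_dev *d)
{
	return d->real_dma_dev ? d->real_dma_dev->bdf : d->bdf;
}
```

Two NVMe subdevices with distinct BDFs inside the VMD domain still present the endpoint's requester ID, so a source-id keyed grouping cannot tell them apart.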
* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

From: Derrick, Jonathan @ 2020-04-09  0:16 UTC (permalink / raw)
To: baolu.lu, drake; +Cc: bhelgaas, dwmw2, iommu, linux

Hi Daniel,

Reviving this thread

On Thu, 2020-02-20 at 18:06 +0800, Daniel Drake wrote:
> [...]
>
> The following changes enforce that the real DMA device is in the DMA
> domain, and avoid the intel_iommu_add_device() codepaths that would try
> to change it to a different domain type. Let me know if I'm on the
> right lines...
>
> [...]
> +	if (dev_is_pci(dev))
> +		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
> +
> +	if (real_dev != dev) {
> +		domain = iommu_get_domain_for_dev(real_dev);
> +		if (domain->type != IOMMU_DOMAIN_DMA) {
> +			dev_err(dev, "Real DMA device not in DMA domain; can't handle DMA\n");
> +			return -ENODEV;
> +		}
> +	}

We need one additional change to enforce IOMMU_DOMAIN_DMA on the real
DMA dev; otherwise it could be put into Identity with the subdevices
in DMA, leading to this WARN:

	struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
	{
		int iommu_id;

		/* si_domain and vm domain should not get here. */
		if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
			return NULL;

We could probably define and enforce it in device_def_domain_type. We
could also try moving the real DMA dev to the DMA domain on the first
subdevice, like below. Though there might be a few cases where we
can't do that.

index 3851204f6ac0..6c80c6c9d808 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5783,13 +5783,32 @@ static bool intel_iommu_capable(enum iommu_cap cap)
 	return false;
 }
 
+static int intel_iommu_request_dma_domain_for_dev(struct device *dev,
+						  struct dmar_domain *domain)
+{
+	int ret;
+
+	ret = iommu_request_dma_domain_for_dev(dev);
+	if (ret) {
+		dmar_remove_one_dev_info(dev);
+		domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
+		if (!get_private_domain_for_dev(dev)) {
+			dev_warn(dev,
+				 "Failed to get a private domain.\n");
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
 static int intel_iommu_add_device(struct device *dev)
 {
-	struct device *real_dev = dev;
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
 	struct intel_iommu *iommu;
 	struct iommu_group *group;
+	struct device *real_dev = dev;
 	u8 bus, devfn;
 	int ret;
 
@@ -5797,18 +5816,6 @@ static int intel_iommu_add_device(struct device *dev)
 	if (!iommu)
 		return -ENODEV;
 
-	if (dev_is_pci(dev))
-		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
-
-	if (real_dev != dev) {
-		domain = iommu_get_domain_for_dev(real_dev);
-		if (domain->type != IOMMU_DOMAIN_DMA) {
-			dmar_remove_one_dev_info(dev)
-			dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-			domain_add_dev_info(IOMMU_DOMAIN_DMA, dev);
-		}
-	}
-
 	iommu_device_link(&iommu->iommu, dev);
 
 	if (translation_pre_enabled(iommu))
@@ -5825,6 +5832,21 @@ static int intel_iommu_add_device(struct device *dev)
 	domain = iommu_get_domain_for_dev(dev);
 	dmar_domain = to_dmar_domain(domain);
+
+	if (dev_is_pci(dev))
+		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	if (real_dev != dev) {
+		domain = iommu_get_domain_for_dev(real_dev);
+		if (domain->type != IOMMU_DOMAIN_DMA) {
+			dmar_remove_one_dev_info(real_dev);
+
+			ret = intel_iommu_request_dma_domain_for_dev(real_dev, dmar_domain);
+			if (ret)
+				goto unlink;
+		}
+	}
+
 	if (domain->type == IOMMU_DOMAIN_DMA) {
 		if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
 			ret = iommu_request_dm_for_dev(dev);
@@ -5838,20 +5860,12 @@ static int intel_iommu_add_device(struct device *dev)
 		}
 	} else {
 		if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
-			ret = iommu_request_dma_domain_for_dev(dev);
-			if (ret) {
-				dmar_remove_one_dev_info(dev);
-				dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-				if (!get_private_domain_for_dev(dev)) {
-					dev_warn(dev,
-						 "Failed to get a private domain.\n");
-					ret = -ENOMEM;
-					goto unlink;
-				}
+			ret = intel_iommu_request_dma_domain_for_dev(dev, dmar_domain);
+			if (ret)
+				goto unlink;
 
-				dev_info(dev,
-					 "Device uses a private dma domain.\n");
-			}
+			dev_info(dev,
+				 "Device uses a private dma domain.\n");
 		}
 	}
* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

From: Derrick, Jonathan @ 2020-04-09  0:33 UTC (permalink / raw)
To: baolu.lu, drake; +Cc: bhelgaas, dwmw2, iommu, linux

Hm, that didn't come through right..

On Wed, 2020-04-08 at 18:16 -0600, Jonathan Derrick wrote:
> Hi Daniel,
>
> Reviving this thread
>
> [...]
>
> We need one additional change to enforce IOMMU_DOMAIN_DMA on the real
> dma dev, otherwise it could be put into Identity and the subdevices as
> DMA leading to this WARN:
>
> [...]
>
> We could probably define and enforce it in device_def_domain_type. We
> could also try moving real dma dev to DMA on the first subdevice, like
> below. Though there might be a few cases we can't do that.

[snip]

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4be549478691..6c80c6c9d808 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3049,6 +3049,9 @@ static int device_def_domain_type(struct device *dev)
 		if ((iommu_identity_mapping & IDENTMAP_GFX) && IS_GFX_DEVICE(pdev))
 			return IOMMU_DOMAIN_IDENTITY;
 
+		if (pci_real_dma_dev(pdev) != pdev)
+			return IOMMU_DOMAIN_DMA;
+
 		/*
 		 * We want to start off with all devices in the 1:1 domain, and
 		 * take them out later if we find they can't access all of memory.
@@ -5780,12 +5783,32 @@ static bool intel_iommu_capable(enum iommu_cap cap)
 	return false;
 }
 
+static int intel_iommu_request_dma_domain_for_dev(struct device *dev,
+						  struct dmar_domain *domain)
+{
+	int ret;
+
+	ret = iommu_request_dma_domain_for_dev(dev);
+	if (ret) {
+		dmar_remove_one_dev_info(dev);
+		domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
+		if (!get_private_domain_for_dev(dev)) {
+			dev_warn(dev,
+				 "Failed to get a private domain.\n");
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
 static int intel_iommu_add_device(struct device *dev)
 {
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
 	struct intel_iommu *iommu;
 	struct iommu_group *group;
+	struct device *real_dev = dev;
 	u8 bus, devfn;
 	int ret;
 
@@ -5809,6 +5832,21 @@ static int intel_iommu_add_device(struct device *dev)
 
 	domain = iommu_get_domain_for_dev(dev);
 	dmar_domain = to_dmar_domain(domain);
+
+	if (dev_is_pci(dev))
+		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	if (real_dev != dev) {
+		domain = iommu_get_domain_for_dev(real_dev);
+		if (domain->type != IOMMU_DOMAIN_DMA) {
+			dmar_remove_one_dev_info(real_dev);
+
+			ret = intel_iommu_request_dma_domain_for_dev(real_dev, dmar_domain);
+			if (ret)
+				goto unlink;
+		}
+	}
+
 	if (domain->type == IOMMU_DOMAIN_DMA) {
 		if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
 			ret = iommu_request_dm_for_dev(dev);
@@ -5822,20 +5860,12 @@ static int intel_iommu_add_device(struct device *dev)
 		}
 	} else {
 		if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
-			ret = iommu_request_dma_domain_for_dev(dev);
-			if (ret) {
-				dmar_remove_one_dev_info(dev);
-				dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-				if (!get_private_domain_for_dev(dev)) {
-					dev_warn(dev,
-						 "Failed to get a private domain.\n");
-					ret = -ENOMEM;
-					goto unlink;
-				}
+			ret = intel_iommu_request_dma_domain_for_dev(dev, dmar_domain);
+			if (ret)
+				goto unlink;
 
-				dev_info(dev,
-					 "Device uses a private dma domain.\n");
-			}
+			dev_info(dev,
+				 "Device uses a private dma domain.\n");
 		}
 	}
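Jon's second approach — flipping the real DMA device into the DMA domain when the first subdevice appears — can be sketched as a toy model. All names are invented for illustration; this is not the kernel code above, just its shape.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the alternative: when a subdevice is added and its
 * real DMA device is still identity-mapped, re-attach the real device
 * to the DMA domain first, then add the subdevice on the DMA path. */
enum model_domain { MODEL_IDENTITY, MODEL_DMA };

struct model_dev {
	enum model_domain domain;
	struct model_dev *real_dma_dev;	/* NULL when device is its own requester */
};

static int model_add_device(struct model_dev *dev)
{
	struct model_dev *real = dev->real_dma_dev;

	if (real) {
		if (real->domain != MODEL_DMA)
			real->domain = MODEL_DMA;	/* move requester first */
		dev->domain = MODEL_DMA;	/* subdevices always map via DMA domain */
	}
	return 0;
}
```

The caveat in the message above still applies: re-attaching the real device only works if no identity mappings are already in use on it, which is why enforcing IOMMU_DOMAIN_DMA in device_def_domain_type() from the start is the safer of the two options.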