iommu.lists.linux-foundation.org archive mirror
* [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Daniel Drake @ 2020-02-19  3:21 UTC
  To: dwmw2, baolu.lu, joro; +Cc: bhelgaas, iommu, linux, jonathan.derrick

From: Jon Derrick <jonathan.derrick@intel.com>

The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint.

The real DMA device is now used for the DMA mapping, but one case was
missed earlier: if the VMD device (and hence subdevices too) are under
IOMMU_DOMAIN_IDENTITY, mappings do not work.

Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
creating an iommu DMA mapping, and fall back on dma_direct_map_page()
for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
case is broken when intel_map_page() handles a subdevice.

We observe that at iommu attach time, dmar_insert_one_dev_info() for
the subdevices will never set dev->archdata.iommu. This is because
that function uses find_domain() to check if there is already an IOMMU
for the device, and find_domain() then defers to the real DMA device
which does have one. Thus dmar_insert_one_dev_info() returns without
assigning dev->archdata.iommu.
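
Roughly, the deferral added by b0140c69637e looks like this inside
find_domain() (paraphrased, not a verbatim excerpt):

    if (dev_is_pci(dev))
            dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;

    /* "dev" is now the real DMA device, so the subdevice's own
     * archdata.iommu is never consulted here */
    info = dev->archdata.iommu;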

Then, later:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. identity_mapping() returns
   false because dev->archdata.iommu is NULL, so iommu_need_mapping()
   returns true, indicating that a mapping is needed.
2. __intel_map_single() is called to create the mapping.
3. __intel_map_single() calls find_domain(). This function now returns
   the IDENTITY domain corresponding to the real DMA device.
4. __intel_map_single() calls domain_get_iommu() on this "real" domain.
   A failure is hit and the entire operation is aborted, because this
   codepath is not intended to handle IDENTITY mappings:
       if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
               return NULL;
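
For clarity, a condensed sketch of iommu_need_mapping() before this
patch (paraphrased, with the identity-domain branch elided):

    static bool iommu_need_mapping(struct device *dev)
    {
            if (iommu_dummy(dev))
                    return false;

            if (identity_mapping(dev)) {
                    /* never reached for the subdevice, because its
                     * dev->archdata.iommu is NULL */
            }

            return true;    /* subdevice "needs" an IOMMU mapping */
    }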

Fix this by using the real DMA device when checking if a mapping is
needed. The IDENTITY case will then directly fall back on
dma_direct_map_page(). The subdevice DMA mask is still considered in
order to handle situations where, e.g., the subdevice only supports
32-bit DMA while the real DMA requester has a 64-bit DMA mask.

Reported-by: Daniel Drake <drake@endlessm.com>
Fixes: b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
---

Notes:
    v2: switch to Jon's approach instead.
    v3: shortcut mask check in non-identity case
    v4: amend commit msg to explain why subdevice DMA mask is still considered
    
    This problem was originally detected with a non-upstream patch which
    creates PCI devices similar to VMD:
    "PCI: Add Intel remapped NVMe device support"
    (https://marc.info/?l=linux-ide&m=156015271021615&w=2)
    
    This patch has now been tested on VMD forced into identity mode.
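
    Identity mode was presumably forced via the kernel command line,
    e.g. something like:

        intel_iommu=on iommu=pt

    (the exact invocation isn't recorded in this thread).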

 drivers/iommu/intel-iommu.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..7ffd252bf835 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3582,12 +3582,16 @@ static struct dmar_domain *get_private_domain_for_dev(struct device *dev)
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static bool iommu_need_mapping(struct device *dev)
 {
+	struct device *dma_dev = dev;
 	int ret;
 
 	if (iommu_dummy(dev))
 		return false;
 
-	ret = identity_mapping(dev);
+	if (dev_is_pci(dev))
+		dma_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	ret = identity_mapping(dma_dev);
 	if (ret) {
 		u64 dma_mask = *dev->dma_mask;
 
@@ -3601,19 +3605,19 @@ static bool iommu_need_mapping(struct device *dev)
 		 * 32 bit DMA is removed from si_domain and fall back to
 		 * non-identity mapping.
 		 */
-		dmar_remove_one_dev_info(dev);
-		ret = iommu_request_dma_domain_for_dev(dev);
+		dmar_remove_one_dev_info(dma_dev);
+		ret = iommu_request_dma_domain_for_dev(dma_dev);
 		if (ret) {
 			struct iommu_domain *domain;
 			struct dmar_domain *dmar_domain;
 
-			domain = iommu_get_domain_for_dev(dev);
+			domain = iommu_get_domain_for_dev(dma_dev);
 			if (domain) {
 				dmar_domain = to_dmar_domain(domain);
 				dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
 			}
-			dmar_remove_one_dev_info(dev);
-			get_private_domain_for_dev(dev);
+			dmar_remove_one_dev_info(dma_dev);
+			get_private_domain_for_dev(dma_dev);
 		}
 
 		dev_info(dev, "32bit DMA uses non-identity mapping\n");
-- 
2.20.1


* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Lu Baolu @ 2020-02-19  3:40 UTC
  To: Daniel Drake, dwmw2, joro; +Cc: bhelgaas, iommu, linux, jonathan.derrick

Hi,

On 2020/2/19 11:21, Daniel Drake wrote:
> From: Jon Derrick <jonathan.derrick@intel.com>
> 
> The PCI devices handled by intel-iommu may have a DMA requester on
> another bus, such as VMD subdevices needing to use the VMD endpoint.
> 
> The real DMA device is now used for the DMA mapping, but one case was
> missed earlier: if the VMD device (and hence subdevices too) are under
> IOMMU_DOMAIN_IDENTITY, mappings do not work.
> 
> Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
> creating an iommu DMA mapping, and fall back on dma_direct_map_page()
> for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
> case is broken when intel_map_page() handles a subdevice.
> 
> We observe that at iommu attach time, dmar_insert_one_dev_info() for
> the subdevices will never set dev->archdata.iommu. This is because
> that function uses find_domain() to check if there is already an IOMMU
> for the device, and find_domain() then defers to the real DMA device
> which does have one. Thus dmar_insert_one_dev_info() returns without
> assigning dev->archdata.iommu.
> 
> Then, later:
> 
> 1. intel_map_page() checks if an IOMMU mapping is needed by calling
>     iommu_need_mapping() on the subdevice. identity_mapping() returns
>     false because dev->archdata.iommu is NULL, so iommu_need_mapping()
>     returns true, indicating that a mapping is needed.
> 2. __intel_map_single() is called to create the mapping.
> 3. __intel_map_single() calls find_domain(). This function now returns
>     the IDENTITY domain corresponding to the real DMA device.
> 4. __intel_map_single() calls domain_get_iommu() on this "real" domain.
>     A failure is hit and the entire operation is aborted, because this
>     codepath is not intended to handle IDENTITY mappings:
>         if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
>                 return NULL;
> 
> Fix this by using the real DMA device when checking if a mapping is
> needed. The IDENTITY case will then directly fall back on
> dma_direct_map_page(). The subdevice DMA mask is still considered in
> order to handle situations where, e.g., the subdevice only supports
> 32-bit DMA while the real DMA requester has a 64-bit DMA mask.

With respect, this is problematic. The parent and all subdevices share
a single translation entry. The DMA mask should be consistent.

Otherwise, for example, subdevice A has 64-bit DMA capability and uses
an identity domain for DMA translation, while subdevice B has only
32-bit DMA capability and is forced to switch to a DMA domain.
Subdevice A will be impacted without any notification.
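
As a purely hypothetical illustration (not code from the driver): for
two VMD subdevices a and b,

    pci_real_dma_dev(a) == pci_real_dma_dev(b)  /* both the VMD endpoint */

so both are translated through the same entry, and a domain switch made
on behalf of one silently retargets the other.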

Best regards,
baolu

* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Daniel Drake @ 2020-02-20  3:36 UTC
  To: Lu Baolu
  Cc: David Woodhouse, iommu, Bjorn Helgaas, Linux Upstreaming Team,
	Jon Derrick

On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
> With respect, this is problematic. The parent and all subdevices share
> a single translation entry. The DMA mask should be consistent.
>
> Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> an identity domain for DMA translation, while subdevice B has only
> 32-bit DMA capability and is forced to switch to a DMA domain.
> Subdevice A will be impacted without any notification.

I see what you mean.

Perhaps we should just ensure that setups involving such real DMA
devices and subdevices should always use the DMA domain, avoiding this
type of complication. That's apparently even the default for VMD. This
is probably something that should be forced/ensured when the real DMA
device gets registered, because similarly to the noted case, we can't
risk any identity mappings having been created on the real device if
we later decide to move it into the DMA domain based on the appearance
of subdevices.

Jon, any thoughts?

Daniel

* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Daniel Drake @ 2020-02-20 10:06 UTC
  To: baolu.lu; +Cc: bhelgaas, linux, iommu, dwmw2, jonathan.derrick

> On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
> > With respect, this is problematic. The parent and all subdevices share
> > a single translation entry. The DMA mask should be consistent.
> >
> > Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> > an identity domain for DMA translation, while subdevice B has only
> > 32-bit DMA capability and is forced to switch to a DMA domain.
> > Subdevice A will be impacted without any notification.

Looking closer, this problematic codepath may already be happening for VMD,
under intel_iommu_add_device(). Consider this function running for a VMD
subdevice; we hit:

    domain = iommu_get_domain_for_dev(dev);

I can't quite grasp the code flow here, but domain->type now always seems
to return the domain type of the real DMA device, which seems like pretty
reasonable behaviour.

    if (domain->type == IOMMU_DOMAIN_DMA) {

and as detailed in previous mails, the real VMD device seems to be in a DMA
domain by default, so we continue.

        if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {

Now we check the default domain type of the subdevice. This seems rather
likely to return IDENTITY because that's effectively the default type...

            ret = iommu_request_dm_for_dev(dev);
            if (ret) {
                dmar_remove_one_dev_info(dev);
                dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
                domain_add_dev_info(si_domain, dev);
                dev_info(dev,
                     "Device uses a private identity domain.\n");
            }
        }

and now we're doing the bad stuff that Lu pointed out: we only have one
mapping shared for all the subdevices, so if we end up changing it for one
subdevice, we're likely to be breaking another.
In this case iommu_request_dm_for_dev() returns -EBUSY without doing anything
and the following private identity code fortunately seems to have no
consequential effects - the real DMA device continues to operate in the DMA
domain, and all subdevice DMA requests go through the DMA mapping codepath.
That's probably why VMD appears to be working fine anyway, but this seems
fragile.
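
(The -EBUSY presumably comes from the group-size check in the generic
request_default_domain_for_dev() in drivers/iommu/iommu.c -- roughly:

    ret = -EBUSY;
    if (iommu_group_device_count(group) != 1)
            goto out;

-- and a VMD subdevice is never alone in its IOMMU group, so the
request can't succeed.)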

The following changes enforce that the real DMA device is in the DMA domain,
and avoid the intel_iommu_add_device() codepaths that would try to change
it to a different domain type. Let me know if I'm on the right lines...
---
 drivers/iommu/intel-iommu.c               | 16 ++++++++++++++++
 drivers/pci/controller/intel-nvme-remap.c |  6 ++++++
 2 files changed, 22 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9644a5b3e0ae..8872b8d1780d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2911,6 +2911,9 @@ static int device_def_domain_type(struct device *dev)
 	if (dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(dev);
 
+		if (pci_real_dma_dev(pdev) != pdev)
+			return IOMMU_DOMAIN_DMA;
+
 		if (device_is_rmrr_locked(dev))
 			return IOMMU_DOMAIN_DMA;
 
@@ -5580,6 +5583,7 @@ static bool intel_iommu_capable(enum iommu_cap cap)
 
 static int intel_iommu_add_device(struct device *dev)
 {
+	struct device *real_dev = dev;
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
 	struct intel_iommu *iommu;
@@ -5591,6 +5595,17 @@ static int intel_iommu_add_device(struct device *dev)
 	if (!iommu)
 		return -ENODEV;
 
+	if (dev_is_pci(dev))
+		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	if (real_dev != dev) {
+		domain = iommu_get_domain_for_dev(real_dev);
+		if (domain->type != IOMMU_DOMAIN_DMA) {
+			dev_err(dev, "Real DMA device not in DMA domain; can't handle DMA\n");
+			return -ENODEV;
+		}
+	}
+
 	iommu_device_link(&iommu->iommu, dev);
 
 	if (translation_pre_enabled(iommu))


* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Lu Baolu @ 2020-02-20 11:58 UTC
  To: Daniel Drake; +Cc: bhelgaas, linux, iommu, dwmw2, jonathan.derrick

Hi,

On 2020/2/20 18:06, Daniel Drake wrote:
>> On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
>>> With respect, this is problematic. The parent and all subdevices share
>>> a single translation entry. The DMA mask should be consistent.
>>>
>>> Otherwise, for example, subdevice A has 64-bit DMA capability and uses
>>> an identity domain for DMA translation, while subdevice B has only
>>> 32-bit DMA capability and is forced to switch to a DMA domain.
>>> Subdevice A will be impacted without any notification.
> Looking closer, this problematic codepath may already be happening for VMD,
> under intel_iommu_add_device(). Consider this function running for a VMD
> subdevice; we hit:
> 
>      domain = iommu_get_domain_for_dev(dev);
> 
> I can't quite grasp the code flow here, but domain->type now always seems
> to return the domain type of the real DMA device, which seems like pretty
> reasonable behaviour.
> 
>      if (domain->type == IOMMU_DOMAIN_DMA) {
> 
> and as detailed in previous mails, the real VMD device seems to be in a DMA
> domain by default, so we continue.
> 
>          if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
> 
> Now we check the default domain type of the subdevice. This seems rather
> likely to return IDENTITY because that's effectively the default type...
> 
>              ret = iommu_request_dm_for_dev(dev);
>              if (ret) {
>                  dmar_remove_one_dev_info(dev);
>                  dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
>                  domain_add_dev_info(si_domain, dev);
>                  dev_info(dev,
>                       "Device uses a private identity domain.\n");
>              }
>          }
> 
> and now we're doing the bad stuff that Lu pointed out: we only have one
> mapping shared for all the subdevices, so if we end up changing it for one
> subdevice, we're likely to be breaking another.

Yes. Agreed.

By the way, do all subdevices stay in the same iommu group?

Best regards,
baolu

* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Derrick, Jonathan @ 2020-02-27 18:19 UTC
  To: baolu.lu, drake; +Cc: bhelgaas, dwmw2, iommu, linux

Hi Baolu, Daniel,

Sorry for the delay. Was offline for the last week.

On Thu, 2020-02-20 at 19:58 +0800, Lu Baolu wrote:
> Hi,
> 
> On 2020/2/20 18:06, Daniel Drake wrote:
> > > On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
> > > > With respect, this is problematic. The parent and all subdevices share
> > > > a single translation entry. The DMA mask should be consistent.
> > > >
> > > > Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> > > > an identity domain for DMA translation, while subdevice B has only
> > > > 32-bit DMA capability and is forced to switch to a DMA domain.
> > > > Subdevice A will be impacted without any notification.
> > Looking closer, this problematic codepath may already be happening for VMD,
> > under intel_iommu_add_device(). Consider this function running for a VMD
> > subdevice; we hit:
> > 
> >      domain = iommu_get_domain_for_dev(dev);
> > 
> > I can't quite grasp the code flow here, but domain->type now always seems
> > to return the domain type of the real DMA device, which seems like pretty
> > reasonable behaviour.
> > 
> >      if (domain->type == IOMMU_DOMAIN_DMA) {
> > 
> > and as detailed in previous mails, the real VMD device seems to be in a DMA
> > domain by default, so we continue.
> > 
> >          if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
> > 
> > Now we check the default domain type of the subdevice. This seems rather
> > likely to return IDENTITY because that's effectively the default type...
> > 
> >              ret = iommu_request_dm_for_dev(dev);
> >              if (ret) {
> >                  dmar_remove_one_dev_info(dev);
> >                  dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
> >                  domain_add_dev_info(si_domain, dev);
> >                  dev_info(dev,
> >                       "Device uses a private identity domain.\n");
> >              }
> >          }
> > 
> > and now we're doing the bad stuff that Lu pointed out: we only have one
> > mapping shared for all the subdevices, so if we end up changing it for one
> > subdevice, we're likely to be breaking another.
> 
> Yes. Agreed.
> 
> By the way, do all subdevices stay in the same iommu group?
> 
> Best regards,
> baolu


The VMD endpoint and all subdevices in the VMD domain are in the same
IOMMU group. The real DMA device for VMD (the VMD endpoint) only
represents the DMA requester as determined by the PCIe source-id. The
VMD endpoint itself doesn't have a DMA engine, so only the subdevices
matter for mapping.
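
For reference, the x86 override that produces this aliasing looks
roughly like the following (paraphrased, not a verbatim excerpt):

    /* VMD subdevices have their DMA aliased to the VMD endpoint */
    struct pci_dev *pci_real_dma_dev(struct pci_dev *dev)
    {
            if (is_vmd(dev->bus))
                    return to_pci_sysdata(dev->bus)->vmd_dev;

            return dev;
    }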

* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Derrick, Jonathan @ 2020-04-09  0:16 UTC
  To: baolu.lu, drake; +Cc: bhelgaas, dwmw2, iommu, linux

Hi Daniel,

Reviving this thread

On Thu, 2020-02-20 at 18:06 +0800, Daniel Drake wrote:
> > On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
> > > With respect, this is problematic. The parent and all subdevices share
> > > a single translation entry. The DMA mask should be consistent.
> > >
> > > Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> > > an identity domain for DMA translation, while subdevice B has only
> > > 32-bit DMA capability and is forced to switch to a DMA domain.
> > > Subdevice A will be impacted without any notification.
> 
> Looking closer, this problematic codepath may already be happening for VMD,
> under intel_iommu_add_device(). Consider this function running for a VMD
> subdevice; we hit:
> 
>     domain = iommu_get_domain_for_dev(dev);
> 
> I can't quite grasp the code flow here, but domain->type now always seems
> to return the domain type of the real DMA device, which seems like pretty
> reasonable behaviour.
> 
>     if (domain->type == IOMMU_DOMAIN_DMA) {
> 
> and as detailed in previous mails, the real VMD device seems to be in a DMA
> domain by default, so we continue.
> 
>         if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
> 
> Now we check the default domain type of the subdevice. This seems rather
> likely to return IDENTITY because that's effectively the default type...
> 
>             ret = iommu_request_dm_for_dev(dev);
>             if (ret) {
>                 dmar_remove_one_dev_info(dev);
>                 dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
>                 domain_add_dev_info(si_domain, dev);
>                 dev_info(dev,
>                      "Device uses a private identity domain.\n");
>             }
>         }
> 
> and now we're doing the bad stuff that Lu pointed out: we only have one
> mapping shared for all the subdevices, so if we end up changing it for one
> subdevice, we're likely to be breaking another.
> In this case iommu_request_dm_for_dev() returns -EBUSY without doing anything
> and the following private identity code fortunately seems to have no
> consequential effects - the real DMA device continues to operate in the DMA
> domain, and all subdevice DMA requests go through the DMA mapping codepath.
> That's probably why VMD appears to be working fine anyway, but this seems
> fragile.
> 
> The following changes enforce that the real DMA device is in the DMA domain,
> and avoid the intel_iommu_add_device() codepaths that would try to change
> it to a different domain type. Let me know if I'm on the right lines...
> ---
>  drivers/iommu/intel-iommu.c               | 16 ++++++++++++++++
>  drivers/pci/controller/intel-nvme-remap.c |  6 ++++++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 9644a5b3e0ae..8872b8d1780d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2911,6 +2911,9 @@ static int device_def_domain_type(struct device *dev)
>  	if (dev_is_pci(dev)) {
>  		struct pci_dev *pdev = to_pci_dev(dev);
>  
> +		if (pci_real_dma_dev(pdev) != pdev)
> +			return IOMMU_DOMAIN_DMA;
> +
>  		if (device_is_rmrr_locked(dev))
>  			return IOMMU_DOMAIN_DMA;
>  
> @@ -5580,6 +5583,7 @@ static bool intel_iommu_capable(enum iommu_cap cap)
>  
>  static int intel_iommu_add_device(struct device *dev)
>  {
> +	struct device *real_dev = dev;
>  	struct dmar_domain *dmar_domain;
>  	struct iommu_domain *domain;
>  	struct intel_iommu *iommu;
> @@ -5591,6 +5595,17 @@ static int intel_iommu_add_device(struct device *dev)
>  	if (!iommu)
>  		return -ENODEV;
>  
> +	if (dev_is_pci(dev))
> +		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
> +
> +	if (real_dev != dev) {
> +		domain = iommu_get_domain_for_dev(real_dev);
> +		if (domain->type != IOMMU_DOMAIN_DMA) {
> +			dev_err(dev, "Real DMA device not in DMA domain; can't handle DMA\n");
> +			return -ENODEV;
> +		}
> +	}
> +
>  	iommu_device_link(&iommu->iommu, dev);
>  
>  	if (translation_pre_enabled(iommu))
> 


We need one additional change to enforce IOMMU_DOMAIN_DMA on the real
DMA dev; otherwise it could be put into an identity domain while the
subdevices are put into DMA domains, leading to this WARN:

struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
{
        int iommu_id;

        /* si_domain and vm domain should not get here. */
        if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
                return NULL;


We could probably define and enforce it in device_def_domain_type. We
could also try moving the real DMA dev to a DMA domain on the first
subdevice, like below, though there might be a few cases where we
can't do that.



diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3851204f6ac0..6c80c6c9d808 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5783,13 +5783,32 @@ static bool intel_iommu_capable(enum iommu_cap cap)
        return false;
 }
 
+static int intel_iommu_request_dma_domain_for_dev(struct device *dev,
+                                                  struct dmar_domain *domain)
+{
+       int ret;
+
+       ret = iommu_request_dma_domain_for_dev(dev);
+       if (ret) {
+               dmar_remove_one_dev_info(dev);
+               domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
+               if (!get_private_domain_for_dev(dev)) {
+                       dev_warn(dev,
+                                "Failed to get a private domain.\n");
+                       return -ENOMEM;
+               }
+       }
+
+       return 0;
+}
+
 static int intel_iommu_add_device(struct device *dev)
 {
-       struct device *real_dev = dev;
        struct dmar_domain *dmar_domain;
        struct iommu_domain *domain;
        struct intel_iommu *iommu;
        struct iommu_group *group;
+       struct device *real_dev = dev;
        u8 bus, devfn;
        int ret;
 
@@ -5797,18 +5816,6 @@ static int intel_iommu_add_device(struct device *dev)
        if (!iommu)
                return -ENODEV;
 
-       if (dev_is_pci(dev))
-               real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
-
-       if (real_dev != dev) {
-               domain = iommu_get_domain_for_dev(real_dev);
-               if (domain->type != IOMMU_DOMAIN_DMA) {
-                       dmar_remove_one_dev_info(dev);
-                       dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-                       domain_add_dev_info(IOMMU_DOMAIN_DMA, dev);
-               }
-       }
-
        iommu_device_link(&iommu->iommu, dev);
 
        if (translation_pre_enabled(iommu))
@@ -5825,6 +5832,21 @@ static int intel_iommu_add_device(struct device *dev)
 
        domain = iommu_get_domain_for_dev(dev);
        dmar_domain = to_dmar_domain(domain);
+
+       if (dev_is_pci(dev))
+               real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+       if (real_dev != dev) {
+               domain = iommu_get_domain_for_dev(real_dev);
+               if (domain->type != IOMMU_DOMAIN_DMA) {
+                       dmar_remove_one_dev_info(real_dev);
+
+                       ret = intel_iommu_request_dma_domain_for_dev(real_dev, dmar_domain);
+                       if (ret)
+                               goto unlink;
+               }
+       }
+
        if (domain->type == IOMMU_DOMAIN_DMA) {
                if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
                        ret = iommu_request_dm_for_dev(dev);
@@ -5838,20 +5860,12 @@ static int intel_iommu_add_device(struct device *dev)
                }
        } else {
                if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
-                       ret = iommu_request_dma_domain_for_dev(dev);
-                       if (ret) {
-                               dmar_remove_one_dev_info(dev);
-                               dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-                               if (!get_private_domain_for_dev(dev)) {
-                                       dev_warn(dev,
-                                                "Failed to get a private domain.\n");
-                                       ret = -ENOMEM;
-                                       goto unlink;
-                               }
+                       ret = intel_iommu_request_dma_domain_for_dev(dev, dmar_domain);
+                       if (ret)
+                               goto unlink;
 
-                               dev_info(dev,
-                                        "Device uses a private dma domain.\n");
-                       }
+                       dev_info(dev,
+                                "Device uses a private dma domain.\n");
                }
        }
 

* Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Derrick, Jonathan @ 2020-04-09  0:33 UTC
  To: baolu.lu, drake; +Cc: bhelgaas, dwmw2, iommu, linux

Hm, that didn't come through right...

On Wed, 2020-04-08 at 18:16 -0600, Jonathan Derrick wrote:
> Hi Daniel,
> 
> Reviving this thread
> 
> On Thu, 2020-02-20 at 18:06 +0800, Daniel Drake wrote:
> > > On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
> > > > With respect, this is problematic. The parent and all subdevices share
> > > > a single translation entry. The DMA mask should be consistent.
> > > >
> > > > Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> > > > an identity domain for DMA translation, while subdevice B has only
> > > > 32-bit DMA capability and is forced to switch to a DMA domain.
> > > > Subdevice A will be impacted without any notification.
> > 
> > Looking closer, this problematic codepath may already be happening for VMD,
> > under intel_iommu_add_device(). Consider this function running for a VMD
> > subdevice; we hit:
> > 
> >     domain = iommu_get_domain_for_dev(dev);
> > 
> > I can't quite grasp the code flow here, but domain->type now always seems
> > to return the domain type of the real DMA device, which seems like pretty
> > reasonable behaviour.
> > 
> >     if (domain->type == IOMMU_DOMAIN_DMA) {
> > 
> > and as detailed in previous mails, the real VMD device seems to be in a DMA
> > domain by default, so we continue.
> > 
> >         if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
> > 
> > Now we check the default domain type of the subdevice. This seems rather
> > likely to return IDENTITY because that's effectively the default type...
> > 
> >             ret = iommu_request_dm_for_dev(dev);
> >             if (ret) {
> >                 dmar_remove_one_dev_info(dev);
> >                 dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
> >                 domain_add_dev_info(si_domain, dev);
> >                 dev_info(dev,
> >                      "Device uses a private identity domain.\n");
> >             }
> >         }
> > 
> > and now we're doing the bad stuff that Lu pointed out: we only have one
> > mapping shared for all the subdevices, so if we end up changing it for one
> > subdevice, we're likely to be breaking another.
> > In this case iommu_request_dm_for_dev() returns -EBUSY without doing anything
> > and the following private identity code fortunately seems to have no
> > consequential effects - the real DMA device continues to operate in the DMA
> > domain, and all subdevice DMA requests go through the DMA mapping codepath.
> > That's probably why VMD appears to be working fine anyway, but this seems
> > fragile.
> > 
> > The following changes enforce that the real DMA device is in the DMA domain,
> > and avoid the intel_iommu_add_device() codepaths that would try to change
> > it to a different domain type. Let me know if I'm on the right lines...
> > ---
> >  drivers/iommu/intel-iommu.c               | 16 ++++++++++++++++
> >  drivers/pci/controller/intel-nvme-remap.c |  6 ++++++
> >  2 files changed, 22 insertions(+)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 9644a5b3e0ae..8872b8d1780d 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -2911,6 +2911,9 @@ static int device_def_domain_type(struct device *dev)
> >  	if (dev_is_pci(dev)) {
> >  		struct pci_dev *pdev = to_pci_dev(dev);
> >  
> > +		if (pci_real_dma_dev(pdev) != pdev)
> > +			return IOMMU_DOMAIN_DMA;
> > +
> >  		if (device_is_rmrr_locked(dev))
> >  			return IOMMU_DOMAIN_DMA;
> >  
> > @@ -5580,6 +5583,7 @@ static bool intel_iommu_capable(enum iommu_cap cap)
> >  
> >  static int intel_iommu_add_device(struct device *dev)
> >  {
> > +	struct device *real_dev = dev;
> >  	struct dmar_domain *dmar_domain;
> >  	struct iommu_domain *domain;
> >  	struct intel_iommu *iommu;
> > @@ -5591,6 +5595,17 @@ static int intel_iommu_add_device(struct device *dev)
> >  	if (!iommu)
> >  		return -ENODEV;
> >  
> > +	if (dev_is_pci(dev))
> > +		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
> > +
> > +	if (real_dev != dev) {
> > +		domain = iommu_get_domain_for_dev(real_dev);
> > +		if (domain->type != IOMMU_DOMAIN_DMA) {
> > +			dev_err(dev, "Real DMA device not in DMA domain; can't handle DMA\n");
> > +			return -ENODEV;
> > +		}
> > +	}
> > +
> >  	iommu_device_link(&iommu->iommu, dev);
> >  
> >  	if (translation_pre_enabled(iommu))
> > 
> 
> We need one additional change to enforce IOMMU_DOMAIN_DMA on the real
> DMA dev; otherwise it could be put into an identity domain while the
> subdevices are put into DMA domains, leading to this WARN:
> 
> struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
> {
>         int iommu_id;
> 
>         /* si_domain and vm domain should not get here. */
>         if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
>                 return NULL;
> 
> 
> We could probably define and enforce it in device_def_domain_type. We
> could also try moving the real DMA dev to a DMA domain on the first
> subdevice, like below, though there might be a few cases where we
> can't do that.
> 
> 
> [snip]

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4be549478691..6c80c6c9d808 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3049,6 +3049,9 @@ static int device_def_domain_type(struct device *dev)
                if ((iommu_identity_mapping & IDENTMAP_GFX) && IS_GFX_DEVICE(pdev))
                        return IOMMU_DOMAIN_IDENTITY;
 
+               if (pci_real_dma_dev(pdev) != pdev)
+                       return IOMMU_DOMAIN_DMA;
+
                /*
                 * We want to start off with all devices in the 1:1 domain, and
                 * take them out later if we find they can't access all of memory.
@@ -5780,12 +5783,32 @@ static bool intel_iommu_capable(enum iommu_cap cap)
        return false;
 }
 
+static int intel_iommu_request_dma_domain_for_dev(struct device *dev,
+                                                  struct dmar_domain *domain)
+{
+       int ret;
+
+       ret = iommu_request_dma_domain_for_dev(dev);
+       if (ret) {
+               dmar_remove_one_dev_info(dev);
+               domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
+               if (!get_private_domain_for_dev(dev)) {
+                       dev_warn(dev,
+                                "Failed to get a private domain.\n");
+                       return -ENOMEM;
+               }
+       }
+
+       return 0;
+}
+
 static int intel_iommu_add_device(struct device *dev)
 {
        struct dmar_domain *dmar_domain;
        struct iommu_domain *domain;
        struct intel_iommu *iommu;
        struct iommu_group *group;
+       struct device *real_dev = dev;
        u8 bus, devfn;
        int ret;
 
@@ -5809,6 +5832,21 @@ static int intel_iommu_add_device(struct device *dev)
 
        domain = iommu_get_domain_for_dev(dev);
        dmar_domain = to_dmar_domain(domain);
+
+       if (dev_is_pci(dev))
+               real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+       if (real_dev != dev) {
+               domain = iommu_get_domain_for_dev(real_dev);
+               if (domain->type != IOMMU_DOMAIN_DMA) {
+                       dmar_remove_one_dev_info(real_dev);
+
+                       ret = intel_iommu_request_dma_domain_for_dev(real_dev, dmar_domain);
+                       if (ret)
+                               goto unlink;
+               }
+       }
+
        if (domain->type == IOMMU_DOMAIN_DMA) {
                if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
                        ret = iommu_request_dm_for_dev(dev);
@@ -5822,20 +5860,12 @@ static int intel_iommu_add_device(struct device *dev)
                }
        } else {
                if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
-                       ret = iommu_request_dma_domain_for_dev(dev);
-                       if (ret) {
-                               dmar_remove_one_dev_info(dev);
-                               dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-                               if (!get_private_domain_for_dev(dev)) {
-                                       dev_warn(dev,
-                                                "Failed to get a private domain.\n");
-                                       ret = -ENOMEM;
-                                       goto unlink;
-                               }
+                       ret = intel_iommu_request_dma_domain_for_dev(dev, dmar_domain);
+                       if (ret)
+                               goto unlink;
 
-                               dev_info(dev,
-                                        "Device uses a private dma domain.\n");
-                       }
+                       dev_info(dev,
+                                "Device uses a private dma domain.\n");
                }
        }
 
