iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: "Liu, Yi L" <yi.l.liu@intel.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>
Cc: "cohuck@redhat.com" <cohuck@redhat.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"yi.y.sun@linux.intel.com" <yi.y.sun@linux.intel.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"shameerali.kolothum.thodi@huawei.com"
	<shameerali.kolothum.thodi@huawei.com>,
	"lulu@redhat.com" <lulu@redhat.com>,
	"suravee.suthikulpanit@amd.com" <suravee.suthikulpanit@amd.com>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
	Jacob Pan <jacob.jun.pan@linux.intel.com>
Subject: RE: [PATCH v3 04/10] iommu/vt-d: Add helper to setup pasid nested translation
Date: Wed, 24 May 2023 07:16:49 +0000	[thread overview]
Message-ID: <BN9PR11MB5276A52907EDD2155D42B3C08C419@BN9PR11MB5276.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20230511145110.27707-5-yi.l.liu@intel.com>

> From: Yi Liu <yi.l.liu@intel.com>
> Sent: Thursday, May 11, 2023 10:51 PM
> 
> +
> +/**
> + * intel_pasid_setup_nested() - Set up PASID entry for nested translation.
> + * This could be used for guest shared virtual address. In this case, the
> + * first level page tables are used for GVA-GPA translation in the guest,
> + * second level page tables are used for GPA-HPA translation.

it's not just for guest SVA. Actually in this series it's RID_PASID nested
translation.

> + *
> + * @iommu:      IOMMU which the device belong to
> + * @dev:        Device to be set up for translation
> + * @pasid:      PASID to be programmed in the device PASID table
> + * @domain:     User domain nested on a s2 domain

"User stage-1 domain"

> + */
> +int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device
> *dev,
> +			     u32 pasid, struct dmar_domain *domain)
> +{
> +	struct iommu_hwpt_intel_vtd *s1_cfg = &domain->s1_cfg;
> +	pgd_t *s1_gpgd = (pgd_t *)(uintptr_t)domain->s1_pgtbl;
> +	struct dmar_domain *s2_domain = domain->s2_domain;
> +	u16 did = domain_id_iommu(domain, iommu);
> +	struct dma_pte *pgd = s2_domain->pgd;
> +	struct pasid_entry *pte;
> +	int agaw;
> +
> +	if (!ecap_nest(iommu->ecap)) {
> +		pr_err_ratelimited("%s: No nested translation support\n",
> +				   iommu->name);
> +		return -ENODEV;
> +	}
> +
> +	/*
> +	 * Sanity checking performed by caller to make sure address width

"by caller"? it's checked in this function.

> +	 * matching in two dimensions: CPU vs. IOMMU, guest vs. host.
> +	 */
> +	switch (s1_cfg->addr_width) {
> +	case ADDR_WIDTH_4LEVEL:
> +		break;
> +#ifdef CONFIG_X86
> +	case ADDR_WIDTH_5LEVEL:
> +		if (!cpu_feature_enabled(X86_FEATURE_LA57) ||
> +		    !cap_fl5lp_support(iommu->cap)) {
> +			dev_err_ratelimited(dev,
> +					    "5-level paging not supported\n");
> +			return -EINVAL;
> +		}
> +		break;
> +#endif
> +	default:
> +		dev_err_ratelimited(dev, "Invalid guest address width %d\n",
> +				    s1_cfg->addr_width);
> +		return -EINVAL;
> +	}
> +
> +	if ((s1_cfg->flags & IOMMU_VTD_PGTBL_SRE) && !ecap_srs(iommu-
> >ecap)) {
> +		pr_err_ratelimited("No supervisor request support on %s\n",
> +				   iommu->name);
> +		return -EINVAL;
> +	}
> +
> +	if ((s1_cfg->flags & IOMMU_VTD_PGTBL_EAFE)
> && !ecap_eafs(iommu->ecap)) {
> +		pr_err_ratelimited("No extended access flag support
> on %s\n",
> +				   iommu->name);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Memory type is only applicable to devices inside processor
> coherent
> +	 * domain. Will add MTS support once coherent devices are available.
> +	 */
> +	if (s1_cfg->flags & IOMMU_VTD_PGTBL_MTS_MASK) {
> +		pr_warn_ratelimited("No memory type support %s\n",
> +				    iommu->name);
> +		return -EINVAL;
> +	}

If it's unsupported why exposing them in the uAPI at this point?

> +
> +	agaw = iommu_skip_agaw(s2_domain, iommu, &pgd);
> +	if (agaw < 0) {
> +		dev_err_ratelimited(dev, "Invalid domain page table\n");
> +		return -EINVAL;
> +	}

this looks problematic.

static inline int iommu_skip_agaw(struct dmar_domain *domain,
                                  struct intel_iommu *iommu,
                                  struct dma_pte **pgd)
{
	int agaw;

	for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
		*pgd = phys_to_virt(dma_pte_addr(*pgd));
		if (!dma_pte_present(*pgd))
			return -EINVAL;
	}

	return agaw;
}

why is it safe to change pgd level of s2 domain when it's used as
the parent? this s2 pgtbl might be used by other devices behind
other iommus which already maps GPAs in a level which this
iommu doesn't support...

shouldn't we simply fail it as another incompatible condition?

> +
> +	/* First level PGD (in GPA) must be supported by the second level. */
> +	if ((uintptr_t)s1_gpgd > (1ULL << s2_domain->gaw)) {
> +		dev_err_ratelimited(dev,
> +				    "Guest PGD %lx not supported,
> max %llx\n",
> +				    (uintptr_t)s1_gpgd, s2_domain-
> >max_addr);
> +		return -EINVAL;
> +	}

I'm not sure how useful this check is. Even if the pgd is sane the
lower level PTEs could include unsupported GPA's. If a guest really
doesn't want to follow the GPA restriction which vIOMMU reports,
it can easily cause IOMMU fault in many ways.

Then why treating pgd specially?

  reply	other threads:[~2023-05-24  7:16 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-11 14:51 [PATCH v3 00/10] Add Intel VT-d nested translation Yi Liu
2023-05-11 14:51 ` [PATCH v3 01/10] iommufd: Add data structure for Intel VT-d stage-1 domain allocation Yi Liu
2023-05-24  6:59   ` Tian, Kevin
2023-05-25  2:28   ` Zhang, Tina
2023-05-29 20:00     ` Jason Gunthorpe
2023-05-29 19:53   ` Jason Gunthorpe
2023-05-11 14:51 ` [PATCH v3 02/10] iommu/vt-d: Extend dmar_domain to support nested domain Yi Liu
2023-05-24  7:02   ` Tian, Kevin
2023-05-26  2:56     ` Baolu Lu
2023-05-11 14:51 ` [PATCH v3 03/10] iommu/vt-d: Add helper for nested domain allocation Yi Liu
2023-05-11 14:51 ` [PATCH v3 04/10] iommu/vt-d: Add helper to setup pasid nested translation Yi Liu
2023-05-24  7:16   ` Tian, Kevin [this message]
2023-05-26  4:16     ` Baolu Lu
2023-06-07  8:34       ` Liu, Yi L
2023-06-08  3:32         ` Liu, Yi L
2023-06-08  3:35       ` Liu, Yi L
2023-06-08  3:37         ` Baolu Lu
2023-05-11 14:51 ` [PATCH v3 05/10] iommu/vt-d: Make domain attach helpers to be extern Yi Liu
2023-05-11 14:51 ` [PATCH v3 06/10] iommu/vt-d: Set the nested domain to a device Yi Liu
2023-05-24  7:22   ` Tian, Kevin
2023-05-26  4:24     ` Baolu Lu
2023-05-11 14:51 ` [PATCH v3 07/10] iommu/vt-d: Add iotlb flush for nested domain Yi Liu
2023-05-24  7:33   ` Tian, Kevin
2023-06-08  7:14     ` Liu, Yi L
2023-06-08  8:07       ` Baolu Lu
2023-06-20  6:22         ` Liu, Yi L
2023-05-11 14:51 ` [PATCH v3 08/10] iommu/vt-d: Add nested domain allocation Yi Liu
2023-05-11 14:51 ` [PATCH v3 09/10] iommu/vt-d: Implement hw_info for iommu capability query Yi Liu
2023-05-11 14:51 ` [PATCH v3 10/10] iommu/vt-d: Disallow nesting on domains with read-only mappings Yi Liu
2023-05-24  7:44   ` Tian, Kevin
2023-05-26  4:28     ` Baolu Lu
2023-05-24  8:59 ` [PATCH v3 00/10] Add Intel VT-d nested translation Tian, Kevin
2023-05-25 18:06   ` Alex Williamson
2023-05-26 11:25     ` Tian, Kevin
2023-05-29 18:43   ` Jason Gunthorpe
2023-05-30  0:16     ` Alex Williamson
2023-05-30 16:42       ` Jason Gunthorpe
2023-06-14  8:07     ` Tian, Kevin
2023-06-14 11:52       ` Jason Gunthorpe
2023-06-16  2:29         ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BN9PR11MB5276A52907EDD2155D42B3C08C419@BN9PR11MB5276.namprd11.prod.outlook.com \
    --to=kevin.tian@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=cohuck@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=iommu@lists.linux.dev \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=jasowang@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=lulu@redhat.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=nicolinc@nvidia.com \
    --cc=peterx@redhat.com \
    --cc=robin.murphy@arm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=yi.l.liu@intel.com \
    --cc=yi.y.sun@linux.intel.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).