From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Auger Eric <eric.auger@redhat.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Joerg Roedel <joro@8bytes.org>,
	"David Woodhouse" <dwmw2@infradead.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jonathan Cameron <jic23@kernel.org>,
	jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH V10 08/11] iommu/vt-d: Add svm/sva invalidate function
Date: Tue, 31 Mar 2020 14:07:40 -0700
Message-ID: <20200331140740.36505c11@jacob-builder>
In-Reply-To: <AADFC41AFE54684AB9EE6CBC0274A5D19D800E75@SHSMSX104.ccr.corp.intel.com>

On Tue, 31 Mar 2020 03:34:22 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Auger Eric <eric.auger@redhat.com>
> > Sent: Monday, March 30, 2020 12:05 AM
> > 
> > On 3/28/20 11:01 AM, Tian, Kevin wrote:  
> > >> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > >> Sent: Saturday, March 21, 2020 7:28 AM
> > >>
> > >> When Shared Virtual Address (SVA) is enabled for a guest OS via
> > >> vIOMMU, we need to provide invalidation support at IOMMU API
> > >> and  
> > driver  
> > >> level. This patch adds Intel VT-d specific function to implement
> > >> iommu passdown invalidate API for shared virtual address.
> > >>
> > >> The use case is for supporting caching structure invalidation
> > >> of assigned SVM capable devices. Emulated IOMMU exposes queue  
> > >
> > > > emulated IOMMU -> vIOMMU, since virtio-iommu could use the
> > > interface as well.
> > >  
> > >> invalidation capability and passes down all descriptors from the
> > >> guest to the physical IOMMU.
> > >>
> > >> The assumption is that guest to host device ID mapping should be
> > >> resolved prior to calling IOMMU driver. Based on the device
> > >> handle, host IOMMU driver can replace certain fields before
> > >> submit to the invalidation queue.
> > >>
> > >> ---
> > >> v7 review fixed in v10
> > >> ---
> > >>
> > >> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > >> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> > >> Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
> > >> ---
> > >>  drivers/iommu/intel-iommu.c | 182
> > >> ++++++++++++++++++++++++++++++++++++++++++++
> > >>  1 file changed, 182 insertions(+)
> > >>
> > >> diff --git a/drivers/iommu/intel-iommu.c
> > >> b/drivers/iommu/intel-iommu.c index b1477cd423dd..a76afb0fd51a
> > >> 100644 --- a/drivers/iommu/intel-iommu.c
> > >> +++ b/drivers/iommu/intel-iommu.c
> > >> @@ -5619,6 +5619,187 @@ static void
> > >> intel_iommu_aux_detach_device(struct iommu_domain *domain,
> > >>  	aux_domain_remove_dev(to_dmar_domain(domain), dev);
> > >>  }
> > >>
> > >> +/*
> > >> + * 2D array for converting and sanitizing IOMMU generic TLB
> > >> granularity  
> > to  
> > >> + * VT-d granularity. Invalidation is typically included in the
> > >> unmap  
> > operation  
> > >> + * as a result of DMA or VFIO unmap. However, for assigned
> > >> devices  
> > guest  
> > >> + * owns the first level page tables. Invalidations of
> > >> translation caches in  
> > the  
> > >> + * guest are trapped and passed down to the host.
> > >> + *
> > >> + * vIOMMU in the guest will only expose first level page
> > >> tables, therefore
> > >> + * we do not include IOTLB granularity for request without
> > >> PASID (second level).  
> > >
> > > I would revise above as "We do not support IOTLB granularity for
> > > request without PASID (second level), therefore any vIOMMU
> > > implementation that exposes the SVA capability to the guest
> > > should only expose the first level page tables, implying all
> > > invalidation requests from the guest will include a valid PASID"
> > >  
> > >> + *
> > >> + * For example, to find the VT-d granularity encoding for IOTLB
> > >> + * type and page selective granularity within PASID:
> > >> + * X: indexed by iommu cache type
> > >> + * Y: indexed by enum iommu_inv_granularity
> > >> + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR]
> > >> + *
> > >> + * Granu_map array indicates validity of the table. 1: valid,
> > >> 0: invalid
> > >> + *
> > >> + */
> > >> +const static int
> > >>  
> > inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_  
> > >> NR] = {
> > >> +	/*
> > >> +	 * PASID based IOTLB invalidation: PASID selective (per
> > >> PASID),
> > >> +	 * page selective (address granularity)
> > >> +	 */
> > >> +	{0, 1, 1},
> > >> +	/* PASID based dev TLBs, only support all PASIDs or
> > >> single PASID */
> > >> +	{1, 1, 0},  
> > >
> > > Is this combination correct? when single PASID is being
> > > specified, it is essentially a page-selective invalidation since
> > > you need provide Address and Size.
> > >  
> > >> +	/* PASID cache */  
> > >
> > > PASID cache is fully managed by the host. Guest PASID cache
> > > invalidation is interpreted by vIOMMU for bind and unbind
> > > operations. I don't think we should accept any PASID cache
> > > invalidation from userspace or guest.  
> > I tend to agree here.  
> > >  
> > >> +	{1, 1, 0}
> > >> +};
> > >> +
> > >> +const static int
> > >>  
> > inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU  
> > >> _NR] = {
> > >> +	/* PASID based IOTLB */
> > >> +	{0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
> > >> +	/* PASID based dev TLBs */
> > >> +	{QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
> > >> +	/* PASID cache */
> > >> +	{QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
> > >> +};
> > >> +
> > >> +static inline int to_vtd_granularity(int type, int granu, int
> > >> *vtd_granu) +{
> > >> +	if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >=
> > >> IOMMU_INV_GRANU_NR ||
> > >> +		!inv_type_granu_map[type][granu])
> > >> +		return -EINVAL;
> > >> +
> > >> +	*vtd_granu = inv_type_granu_table[type][granu];
> > >> +  
> > >
> > > btw do we really need both map and table here? Can't we just
> > > use one table with unsupported granularity marked as a special
> > > value?  
> > I asked the same question some time ago. If I remember correctly the
> > issue is while a granu can be supported in inv_type_granu_map, the
> > associated value in inv_type_granu_table can be 0. This typically
> > matches both values of G field (0 or 1) in the invalidation cmd. See
> > other comment below.  
> 
> I didn't fully understand it. Also what does a value '0' imply? also
> it's interesting to see below in [PATCH 07/11]:
> 
A 0 in the 2D map array means the type/granularity combination is invalid.
A 0 in the granu table can be either a valid encoding or an unused entry.
That is why we need the map array to tell the difference.
I will add the following comment, since this has caused a lot of confusion.

 * Granu_map array indicates validity of the table. 1: valid, 0: invalid
 * This is useful when the entry in the granu table has a value of 0,
 * which can be a valid or invalid value.
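
To make the distinction concrete, here is a minimal user-space sketch of
the two-array lookup. It is an illustration only, not the patch code: the
enum names, the lookup_granu() helper, and the X_* encodings are
placeholders I made up for the example. Only the two Dev-IOTLB values are
taken from the defines quoted below.

```c
#include <errno.h>

/* Hypothetical index names; the order follows the rows/columns in the patch. */
enum cache_type { TYPE_IOTLB, TYPE_DEV_IOTLB, TYPE_PASID, TYPE_NR };
enum granu { GRANU_DOMAIN, GRANU_PASID, GRANU_ADDR, GRANU_NR };

/* Dev-IOTLB encodings as quoted from [PATCH 07/11]. */
#define QI_DEV_IOTLB_GRAN_ALL       1
#define QI_DEV_IOTLB_GRAN_PASID_SEL 0

/* Placeholder encodings: assumed values, for illustration only. */
#define X_IOTLB_PASID 2
#define X_IOTLB_PAGE  3
#define X_PC_ALL      0
#define X_PC_SEL      1

/* 1: the type/granularity pair is accepted, 0: it is rejected. */
static const int granu_map[TYPE_NR][GRANU_NR] = {
	{0, 1, 1},	/* PASID based IOTLB */
	{1, 1, 0},	/* PASID based dev TLB */
	{1, 1, 0},	/* PASID cache */
};

/* Hardware encoding for each pair; note that 0 appears in valid slots too. */
static const int granu_table[TYPE_NR][GRANU_NR] = {
	{0, X_IOTLB_PASID, X_IOTLB_PAGE},
	{QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
	{X_PC_ALL, X_PC_SEL, 0},
};

/* Returns 0 and stores the encoding, or -EINVAL for a rejected pair. */
static int lookup_granu(int type, int granu, int *out)
{
	if (type < 0 || type >= TYPE_NR || granu < 0 || granu >= GRANU_NR ||
	    !granu_map[type][granu])
		return -EINVAL;

	*out = granu_table[type][granu];
	return 0;
}
```

With these arrays, lookup_granu(TYPE_DEV_IOTLB, GRANU_PASID, &g) succeeds
and stores 0, while lookup_granu(TYPE_DEV_IOTLB, GRANU_ADDR, &g) fails
even though the table also holds 0 in that slot. A single table would
need a sentinel outside the range of valid encodings (for example -1) to
express the same thing.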


> +/* QI Dev-IOTLB inv granu */
> +#define QI_DEV_IOTLB_GRAN_ALL		1
> +#define QI_DEV_IOTLB_GRAN_PASID_SEL	0
> +
> 
Sorry, I didn't get the point. These are the valid VT-d granu values,
per spec chapter 6.5.2.6.

> > >  
> > >> +	return 0;
> > >> +}
> > >> +
> > >> +static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules)
> > >> +{
> > >> +	u64 nr_pages = (granu_size * nr_granules) >>
> > >> VTD_PAGE_SHIFT; +
> > >> +	/* VT-d size is encoded as 2^size of 4K pages, 0 for
> > >> 4k, 9 for 2MB, etc.
> > >> +	 * IOMMU cache invalidate API passes granu_size in
> > >> bytes, and number of
> > >> +	 * granu size in contiguous memory.
> > >> +	 */
> > >> +	return order_base_2(nr_pages);
> > >> +}
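
For reference, the size encoding described in the comment above can be
sketched in user space as follows. This is a rough illustration under
assumptions, not the driver code: ceil_log2() is a hand-written stand-in
for the kernel's order_base_2(), and size_order(), misaligned() and
PAGE_SHIFT_4K are hypothetical names (the 4K page shift is 12).

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12

/* Smallest order such that (1 << order) >= n; yields 0 for n <= 1. */
static unsigned int ceil_log2(uint64_t n)
{
	unsigned int order = 0;

	while (order < 63 && ((uint64_t)1 << order) < n)
		order++;
	return order;
}

/* Total range in bytes -> order of 4K pages (0 for 4K, 9 for 2MB). */
static uint64_t size_order(uint64_t granule_size, uint64_t nr_granules)
{
	uint64_t nr_pages = (granule_size * nr_granules) >> PAGE_SHIFT_4K;

	return ceil_log2(nr_pages);
}

/* Nonzero if addr is not aligned to a range of 2^order 4K pages. */
static int misaligned(uint64_t addr, uint64_t order)
{
	return (addr & (((uint64_t)1 << (PAGE_SHIFT_4K + order)) - 1)) != 0;
}
```

For example, 512 granules of 4096 bytes give order 9, the same as one
2MB granule, and a page count that is not a power of two is rounded up
(3 pages give order 2). An address of 0x201000 with order 9 is reported
as misaligned, which corresponds to the -ERANGE case in the invalidate
function further down.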
> > >> +
> > >> +#ifdef CONFIG_INTEL_IOMMU_SVM
> > >> +static int intel_iommu_sva_invalidate(struct iommu_domain
> > >> *domain,
> > >> +		struct device *dev, struct
> > >> iommu_cache_invalidate_info *inv_info)
> > >> +{
> > >> +	struct dmar_domain *dmar_domain =
> > >> to_dmar_domain(domain);
> > >> +	struct device_domain_info *info;
> > >> +	struct intel_iommu *iommu;
> > >> +	unsigned long flags;
> > >> +	int cache_type;
> > >> +	u8 bus, devfn;
> > >> +	u16 did, sid;
> > >> +	int ret = 0;
> > >> +	u64 size = 0;
> > >> +
> > >> +	if (!inv_info || !dmar_domain ||
> > >> +		inv_info->version !=
> > >> IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
> > >> +		return -EINVAL;
> > >> +
> > >> +	if (!dev || !dev_is_pci(dev))
> > >> +		return -ENODEV;
> > >> +
> > >> +	iommu = device_to_iommu(dev, &bus, &devfn);
> > >> +	if (!iommu)
> > >> +		return -ENODEV;
> > >> +
> > >> +	spin_lock_irqsave(&device_domain_lock, flags);
> > >> +	spin_lock(&iommu->lock);
> > >> +	info = iommu_support_dev_iotlb(dmar_domain, iommu, bus,
> > >> devfn);
> > >> +	if (!info) {
> > >> +		ret = -EINVAL;
> > >> +		goto out_unlock;  
> > >
> > > -ENOTSUPP?
> > >  
> > >> +	}
> > >> +	did = dmar_domain->iommu_did[iommu->seq_id];
> > >> +	sid = PCI_DEVID(bus, devfn);
> > >> +
> > >> +	/* Size is only valid in non-PASID selective
> > >> invalidation */
> > >> +	if (inv_info->granularity != IOMMU_INV_GRANU_PASID)
> > >> +		size =
> > >> to_vtd_size(inv_info->addr_info.granule_size,
> > >> +
> > >> inv_info->addr_info.nb_granules); +
> > >> +	for_each_set_bit(cache_type, (unsigned long
> > >> *)&inv_info->cache, IOMMU_CACHE_INV_TYPE_NR) {
> > >> +		int granu = 0;
> > >> +		u64 pasid = 0;
> > >> +
> > >> +		ret = to_vtd_granularity(cache_type,
> > >> inv_info->granularity, &granu);
> > >> +		if (ret) {
> > >> +			pr_err("Invalid cache type and granu
> > >> combination %d/%d\n", cache_type,
> > >> +				inv_info->granularity);
> > >> +			break;
> > >> +		}
> > >> +
> > >> +		/* PASID is stored in different locations based
> > >> on granularity */
> > >> +		if (inv_info->granularity ==
> > >> IOMMU_INV_GRANU_PASID &&
> > >> +			inv_info->pasid_info.flags &
> > >> IOMMU_INV_PASID_FLAGS_PASID)
> > >> +			pasid = inv_info->pasid_info.pasid;
> > >> +		else if (inv_info->granularity ==
> > >> IOMMU_INV_GRANU_ADDR &&
> > >> +			inv_info->addr_info.flags &
> > >> IOMMU_INV_ADDR_FLAGS_PASID)
> > >> +			pasid = inv_info->addr_info.pasid;
> > >> +		else {
> > >> +			pr_err("Cannot find PASID for given
> > >> cache type and granularity\n");
> > >> +			break;
> > >> +		}
> > >> +
> > >> +		switch (BIT(cache_type)) {
> > >> +		case IOMMU_CACHE_INV_TYPE_IOTLB:
> > >> +			if ((inv_info->granularity !=
> > >> IOMMU_INV_GRANU_PASID) &&  
> > >
> > > granularity == IOMMU_INV_GRANU_ADDR? otherwise it's unclear
> > > why IOMMU_INV_GRANU_DOMAIN also needs size check.
> > >  
> > >> +				size &&
> > >> (inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + size)) - 1)))
> > >> {
> > >> +				pr_err("Address out of range,
> > >> 0x%llx, size order %llu\n",
> > >> +
> > >> inv_info->addr_info.addr, size);
> > >> +				ret = -ERANGE;
> > >> +				goto out_unlock;
> > >> +			}
> > >> +
> > >> +			qi_flush_piotlb(iommu, did,
> > >> +					pasid,
> > >> +
> > >> mm_to_dma_pfn(inv_info-  
> > >>> addr_info.addr),  
> > >> +					(granu ==
> > >> QI_GRAN_NONG_PASID) ? - 1 : 1 << size,
> > >> +
> > >> inv_info->addr_info.flags & IOMMU_INV_ADDR_FLAGS_LEAF);
> > >> +
> > >> +			/*
> > >> +			 * Always flush device IOTLB if ATS is
> > >> enabled since guest
> > >> +			 * vIOMMU exposes CM = 1, no device
> > >> IOTLB flush will be passed
> > >> +			 * down.
> > >> +			 */  
> > >
> > > Does VT-d spec mention that no device IOTLB flush is required
> > > when CM=1? 
> > >> +			if (info->ats_enabled) {
> > >> +				qi_flush_dev_iotlb_pasid(iommu,
> > >> sid, info-  
> > >>> pfsid,  
> > >> +						pasid,
> > >> info->ats_qdep,
> > >> +
> > >> inv_info->addr_info.addr, size,
> > >> +						granu);
> > >> +			}
> > >> +			break;
> > >> +		case IOMMU_CACHE_INV_TYPE_DEV_IOTLB:
> > >> +			if (info->ats_enabled) {
> > >> +				qi_flush_dev_iotlb_pasid(iommu,
> > >> sid, info-  
> > >>> pfsid,  
> > >> +
> > >> inv_info->addr_info.pasid, info->ats_qdep,
> > >> +
> > >> inv_info->addr_info.addr, size,
> > >> +						granu);  
> > >
> > > I'm confused here. There are two granularities allowed for
> > > devtlb, but here you only handle one of them?  
> > granu is the result of to_vtd_granularity() so it can take either
> > of the 2 values.  
> 
> yes, you're right. 
> 
> > 
> > Thanks
> > 
> > Eric  
> > >  
> > >> +			} else
> > >> +				pr_warn("Passdown device IOTLB
> > >> flush w/o ATS!\n");
> > >> +
> > >> +			break;
> > >> +		case IOMMU_CACHE_INV_TYPE_PASID:
> > >> +			qi_flush_pasid_cache(iommu, did, granu,
> > >> inv_info-  
> > >>> pasid_info.pasid);  
> > >> +  
> > >
> > > as earlier comment, we shouldn't allow userspace or guest to
> > > invalidate PASID cache
> > >  
> > >> +			break;
> > >> +		default:
> > >> +			dev_err(dev, "Unsupported IOMMU
> > >> invalidation type %d\n",
> > >> +				cache_type);
> > >> +			ret = -EINVAL;
> > >> +		}
> > >> +	}
> > >> +out_unlock:
> > >> +	spin_unlock(&iommu->lock);
> > >> +	spin_unlock_irqrestore(&device_domain_lock, flags);
> > >> +
> > >> +	return ret;
> > >> +}
> > >> +#endif
> > >> +
> > >>  static int intel_iommu_map(struct iommu_domain *domain,
> > >>  			   unsigned long iova, phys_addr_t hpa,
> > >>  			   size_t size, int iommu_prot, gfp_t
> > >> gfp) @@ -6204,6 +6385,7 @@ const struct iommu_ops
> > >> intel_iommu_ops = { .is_attach_deferred	=
> > >> intel_iommu_is_attach_deferred, .pgsize_bitmap		=
> > >> INTEL_IOMMU_PGSIZES, #ifdef CONFIG_INTEL_IOMMU_SVM
> > >> +	.cache_invalidate	= intel_iommu_sva_invalidate,
> > >>  	.sva_bind_gpasid	= intel_svm_bind_gpasid,
> > >>  	.sva_unbind_gpasid	= intel_svm_unbind_gpasid,
> > >>  #endif
> > >> --
> > >> 2.7.4  
> > >  
> 

[Jacob Pan]
