From: Alex Williamson <alex.williamson@redhat.com>
To: Liu Yi L <yi.l.liu@intel.com>
Cc: eric.auger@redhat.com, baolu.lu@linux.intel.com, joro@8bytes.org,
	kevin.tian@intel.com, jacob.jun.pan@linux.intel.com,
	ashok.raj@intel.com, jun.j.tian@intel.com, yi.y.sun@intel.com,
	jean-philippe@linaro.org, peterx@redhat.com, jasowang@redhat.com,
	hao.wu@intel.com, stefanha@gmail.com,
	iommu@lists.linux-foundation.org, kvm@vger.kernel.org
Subject: Re: [PATCH v7 07/16] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)
Date: Fri, 11 Sep 2020 15:38:06 -0600
Message-ID: <20200911153806.6dda06b9@w520.home>
In-Reply-To: <1599734733-6431-8-git-send-email-yi.l.liu@intel.com>

On Thu, 10 Sep 2020 03:45:24 -0700
Liu Yi L <yi.l.liu@intel.com> wrote:

> This patch allows userspace to request PASID allocation/free, e.g. when
> serving the request from the guest.
> 
> PASIDs that are not freed by userspace are automatically freed when the
> IOASID set is destroyed at process exit.
> 
> Cc: Kevin Tian <kevin.tian@intel.com>
> CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
> v6 -> v7:
> *) VFIO currently returns the allocated PASID via a signed int, so the
>    VFIO UAPI can only support 31-bit PASIDs. If userspace passes a min or
>    max wider than 31 bits, the allocation or free request fails.
> 
> v5 -> v6:
> *) address comments from Eric against v5. remove the alloc/free helper.
> 
> v4 -> v5:
> *) address comments from Eric Auger.
> *) the comments on the PASID_FREE request are addressed in patch 5/15 of
>    this series.
> 
> v3 -> v4:
> *) address comments from v3, except the below comment against the range
>    of PASID_FREE request. needs more help on it.
>     "> +if (req.range.min > req.range.max)  
> 
>      Is it exploitable that a user can spin the kernel for a long time in
>      the case of a free by calling this with [0, MAX_UINT] regardless of
>      their actual allocations?"
>     https://lore.kernel.org/linux-iommu/20200702151832.048b44d1@x1.home/
> 
> v1 -> v2:
> *) move the vfio_mm related code to be a separate module
> *) use a single structure for alloc/free, could support a range of PASIDs
> *) fetch vfio_mm at group_attach time instead of at iommu driver open time
> ---
>  drivers/vfio/Kconfig            |  1 +
>  drivers/vfio/vfio_iommu_type1.c | 76 +++++++++++++++++++++++++++++++++++++++++
>  drivers/vfio/vfio_pasid.c       | 10 ++++++
>  include/linux/vfio.h            |  6 ++++
>  include/uapi/linux/vfio.h       | 43 +++++++++++++++++++++++
>  5 files changed, 136 insertions(+)
> 
> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> index 3d8a108..95d90c6 100644
> --- a/drivers/vfio/Kconfig
> +++ b/drivers/vfio/Kconfig
> @@ -2,6 +2,7 @@
>  config VFIO_IOMMU_TYPE1
>  	tristate
>  	depends on VFIO
> +	select VFIO_PASID if (X86)
>  	default n
>  
>  config VFIO_IOMMU_SPAPR_TCE
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 3c0048b..bd4b668 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -76,6 +76,7 @@ struct vfio_iommu {
>  	bool				dirty_page_tracking;
>  	bool				pinned_page_dirty_scope;
>  	struct iommu_nesting_info	*nesting_info;
> +	struct vfio_mm			*vmm;
>  };
>  
>  struct vfio_domain {
> @@ -2000,6 +2001,11 @@ static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
>  
>  static void vfio_iommu_release_nesting_info(struct vfio_iommu *iommu)
>  {
> +	if (iommu->vmm) {
> +		vfio_mm_put(iommu->vmm);
> +		iommu->vmm = NULL;
> +	}
> +
>  	kfree(iommu->nesting_info);
>  	iommu->nesting_info = NULL;
>  }
> @@ -2127,6 +2133,26 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  					    iommu->nesting_info);
>  		if (ret)
>  			goto out_detach;
> +
> +		if (iommu->nesting_info->features &
> +					IOMMU_NESTING_FEAT_SYSWIDE_PASID) {
> +			struct vfio_mm *vmm;
> +			struct ioasid_set *set;
> +
> +			vmm = vfio_mm_get_from_task(current);
> +			if (IS_ERR(vmm)) {
> +				ret = PTR_ERR(vmm);
> +				goto out_detach;
> +			}
> +			iommu->vmm = vmm;
> +
> +			set = vfio_mm_ioasid_set(vmm);
> +			ret = iommu_domain_set_attr(domain->domain,
> +						    DOMAIN_ATTR_IOASID_SET,
> +						    set);
> +			if (ret)
> +				goto out_detach;
> +		}
>  	}
>  
>  	/* Get aperture info */
> @@ -2908,6 +2934,54 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
>  	return -EINVAL;
>  }
>  
> +static int vfio_iommu_type1_pasid_request(struct vfio_iommu *iommu,
> +					  unsigned long arg)
> +{
> +	struct vfio_iommu_type1_pasid_request req;
> +	unsigned long minsz;
> +	int ret;
> +
> +	minsz = offsetofend(struct vfio_iommu_type1_pasid_request, range);
> +
> +	if (copy_from_user(&req, (void __user *)arg, minsz))
> +		return -EFAULT;
> +
> +	if (req.argsz < minsz || (req.flags & ~VFIO_PASID_REQUEST_MASK))
> +		return -EINVAL;
> +
> +	/*
> +	 * Current VFIO_IOMMU_PASID_REQUEST only supports at most
> +	 * 31 bits PASID. The min,max value from userspace should
> +	 * not exceed 31 bits.

Please describe the source of this restriction.  I think it's due to
using the ioctl return value to return the PASID, thus excluding the
negative values, but aren't we actually restricted to pasid_bits
exposed in the nesting_info?  If this is just a sanity test for the API
then why are we defining VFIO_IOMMU_PASID_BITS in the uapi header,
which causes conflicting information to the user... which do they
honor?  Should we instead verify that pasid_bits matches our API scheme
when configuring the nested domain and then let the ioasid allocator
reject requests outside of the range?
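
To illustrate the alternative suggested above, here is a minimal standalone
sketch of a one-time check made when the nested domain is configured, rather
than on every request. It is not code from this series: the helper name
check_pasid_bits() and the macro API_MAX_PASID_BITS are hypothetical, and the
only assumption is that the PASID is handed back through a non-negative int
ioctl return value, which leaves 31 usable bits.

```c
#include <errno.h>

/*
 * Hypothetical limit: the widest PASID that still fits in a
 * non-negative int ioctl return value.
 */
#define API_MAX_PASID_BITS 31u

/*
 * Hypothetical configuration-time check: refuse a pasid_bits value
 * that the return-value scheme cannot represent. Per-request range
 * checks can then be left to the allocator.
 */
static int check_pasid_bits(unsigned int pasid_bits)
{
	if (pasid_bits > API_MAX_PASID_BITS)
		return -EINVAL;
	return 0;
}
```

With a check like this done once at setup, the width reported in nesting_info
would be the only limit userspace needs to know about.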

> +	 */
> +	if (req.range.min > req.range.max ||
> +	    req.range.min > (1 << VFIO_IOMMU_PASID_BITS) ||
> +	    req.range.max > (1 << VFIO_IOMMU_PASID_BITS))

Off by one, >= for the bit test.
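
To make the boundary concrete, here is a small standalone sketch, not the
patch's code. With N PASID bits the valid values are 0 through (1 << N) - 1,
so the value 1 << N itself has to be rejected, which a plain ">" does not do.
The names pasid_range_valid() and PASID_BITS are made up for illustration.
The sketch also uses an unsigned shift, since shifting a signed 1 left by 31
moves into the sign bit of an int.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VFIO_IOMMU_PASID_BITS from the patch. */
#define PASID_BITS 31

/*
 * Return true when [min, max] is a well-formed inclusive range whose
 * bounds both fit in PASID_BITS bits, i.e. both are < 2^PASID_BITS.
 */
static bool pasid_range_valid(uint32_t min, uint32_t max)
{
	/* First value that no longer fits in PASID_BITS bits. */
	const uint32_t limit = UINT32_C(1) << PASID_BITS;

	if (min > max)
		return false;
	/* ">=" rather than ">": limit itself needs PASID_BITS + 1 bits. */
	if (min >= limit || max >= limit)
		return false;
	return true;
}
```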

> +		return -EINVAL;
> +
> +	mutex_lock(&iommu->lock);
> +	if (!iommu->vmm) {
> +		mutex_unlock(&iommu->lock);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	switch (req.flags & VFIO_PASID_REQUEST_MASK) {
> +	case VFIO_IOMMU_FLAG_ALLOC_PASID:
> +		ret = vfio_pasid_alloc(iommu->vmm, req.range.min,
> +				       req.range.max);
> +		break;
> +	case VFIO_IOMMU_FLAG_FREE_PASID:
> +		vfio_pasid_free_range(iommu->vmm, req.range.min,
> +				      req.range.max);
> +		ret = 0;

Set the initial value when it's declared?
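
Roughly what that would look like, as a standalone sketch with stubbed
helpers. The FLAG_* macros, stub_alloc() and stub_free_range() are
placeholders invented here so the snippet compiles outside the kernel; they
are not the functions from this series.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder flag values mirroring the alloc/free bits in the patch. */
#define FLAG_ALLOC	(1u << 0)
#define FLAG_FREE	(1u << 1)
#define FLAG_MASK	(FLAG_ALLOC | FLAG_FREE)

/* Stub allocator: pretends the lowest value in the range is free. */
static int stub_alloc(uint32_t min, uint32_t max)
{
	(void)max;
	return (int)min;
}

/* Stub free: nothing to release in this sketch. */
static void stub_free_range(uint32_t min, uint32_t max)
{
	(void)min;
	(void)max;
}

static int pasid_request(uint32_t flags, uint32_t min, uint32_t max)
{
	int ret = 0;	/* default result, used as-is by the free path */

	switch (flags & FLAG_MASK) {
	case FLAG_ALLOC:
		ret = stub_alloc(min, max);
		break;
	case FLAG_FREE:
		stub_free_range(min, max);
		break;
	default:
		/* Neither flag or both flags set. */
		ret = -EINVAL;
	}
	return ret;
}
```

The free case then no longer needs its own assignment, and the default case
still rejects a request with neither or both flags set.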

> +		break;
> +	default:
> +		ret = -EINVAL;
> +	}
> +	mutex_unlock(&iommu->lock);
> +	return ret;
> +}
> +
>  static long vfio_iommu_type1_ioctl(void *iommu_data,
>  				   unsigned int cmd, unsigned long arg)
>  {
> @@ -2924,6 +2998,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  		return vfio_iommu_type1_unmap_dma(iommu, arg);
>  	case VFIO_IOMMU_DIRTY_PAGES:
>  		return vfio_iommu_type1_dirty_pages(iommu, arg);
> +	case VFIO_IOMMU_PASID_REQUEST:
> +		return vfio_iommu_type1_pasid_request(iommu, arg);
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/drivers/vfio/vfio_pasid.c b/drivers/vfio/vfio_pasid.c
> index 44ecdd5..0ec4660 100644
> --- a/drivers/vfio/vfio_pasid.c
> +++ b/drivers/vfio/vfio_pasid.c
> @@ -60,6 +60,7 @@ void vfio_mm_put(struct vfio_mm *vmm)
>  {
>  	kref_put_mutex(&vmm->kref, vfio_mm_release, &vfio_mm_lock);
>  }
> +EXPORT_SYMBOL_GPL(vfio_mm_put);
>  
>  static void vfio_mm_get(struct vfio_mm *vmm)
>  {
> @@ -113,6 +114,13 @@ struct vfio_mm *vfio_mm_get_from_task(struct task_struct *task)
>  	mmput(mm);
>  	return vmm;
>  }
> +EXPORT_SYMBOL_GPL(vfio_mm_get_from_task);
> +
> +struct ioasid_set *vfio_mm_ioasid_set(struct vfio_mm *vmm)
> +{
> +	return vmm->ioasid_set;
> +}
> +EXPORT_SYMBOL_GPL(vfio_mm_ioasid_set);
>  
>  /*
>   * Find PASID within @min and @max
> @@ -201,6 +209,7 @@ int vfio_pasid_alloc(struct vfio_mm *vmm, int min, int max)
>  
>  	return pasid;
>  }
> +EXPORT_SYMBOL_GPL(vfio_pasid_alloc);
>  
>  void vfio_pasid_free_range(struct vfio_mm *vmm,
>  			   ioasid_t min, ioasid_t max)
> @@ -217,6 +226,7 @@ void vfio_pasid_free_range(struct vfio_mm *vmm,
>  		vfio_remove_pasid(vmm, vid);
>  	mutex_unlock(&vmm->pasid_lock);
>  }
> +EXPORT_SYMBOL_GPL(vfio_pasid_free_range);
>  
>  static int __init vfio_pasid_init(void)
>  {
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 31472a9..5c3d7a8 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -101,6 +101,7 @@ struct vfio_mm;
>  #if IS_ENABLED(CONFIG_VFIO_PASID)
>  extern struct vfio_mm *vfio_mm_get_from_task(struct task_struct *task);
>  extern void vfio_mm_put(struct vfio_mm *vmm);
> +extern struct ioasid_set *vfio_mm_ioasid_set(struct vfio_mm *vmm);
>  extern int vfio_pasid_alloc(struct vfio_mm *vmm, int min, int max);
>  extern void vfio_pasid_free_range(struct vfio_mm *vmm,
>  				  ioasid_t min, ioasid_t max);
> @@ -114,6 +115,11 @@ static inline void vfio_mm_put(struct vfio_mm *vmm)
>  {
>  }
>  
> +static inline struct ioasid_set *vfio_mm_ioasid_set(struct vfio_mm *vmm)
> +{
> +	return -ENOTTY;
> +}
> +
>  static inline int vfio_pasid_alloc(struct vfio_mm *vmm, int min, int max)
>  {
>  	return -ENOTTY;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index ff40f9e..a4bc42e 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1172,6 +1172,49 @@ struct vfio_iommu_type1_dirty_bitmap_get {
>  
>  #define VFIO_IOMMU_DIRTY_PAGES             _IO(VFIO_TYPE, VFIO_BASE + 17)
>  
> +/**
> + * VFIO_IOMMU_PASID_REQUEST - _IOWR(VFIO_TYPE, VFIO_BASE + 18,
> + *				struct vfio_iommu_type1_pasid_request)
> + *
> + * PASID (Process Address Space ID) is a PCIe concept for tagging
> + * address spaces in DMA requests. When system-wide PASID allocation
> + * is required by the underlying iommu driver (e.g. Intel VT-d), this
> + * provides an interface for userspace to request pasid alloc/free
> + * for its assigned devices. Userspace should check the availability
> + * of this API by checking VFIO_IOMMU_TYPE1_INFO_CAP_NESTING through
> + * VFIO_IOMMU_GET_INFO.
> + *
> + * @flags=VFIO_IOMMU_FLAG_ALLOC_PASID, allocate a single PASID within @range.
> + * @flags=VFIO_IOMMU_FLAG_FREE_PASID, free the PASIDs within @range.
> + * @range is [min, max], which means both @min and @max are inclusive.
> + * ALLOC_PASID and FREE_PASID are mutually exclusive.
> + *
> + * Current interface supports at most 31 bits PASID bits as returning
> + * PASID allocation result via signed int. PCIe spec defines 20 bits
> + * for PASID width, so 31 bits is enough. As a result user space should
> + * provide min, max no more than 31 bits.

Perhaps this is the description I was looking for, but this still
conflicts with what I think the user is supposed to do, which is to
provide a range within nesting_info.pasid_bits.  These seem like
implementation details, not uapi.  Thanks,

Alex
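
For reference, a sketch of how userspace might derive its request range from
a reported PASID width instead of a fixed 31-bit constant. Both helpers are
hypothetical, and how the width is obtained (nesting_info.pasid_bits in this
discussion) is outside the snippet; it only shows the arithmetic.

```c
#include <stdint.h>

/* Largest PASID representable in `bits` bits (hypothetical helper). */
static uint32_t pasid_max_for_bits(unsigned int bits)
{
	/* A shift by 32 or more would be undefined for a 32-bit type. */
	return bits >= 32 ? UINT32_MAX : (UINT32_C(1) << bits) - 1;
}

/*
 * Clamp an inclusive [min, max] request to the reported width.
 * Returns 0 on success, -1 if no value in the range is usable.
 */
static int clamp_pasid_range(unsigned int bits, uint32_t *min, uint32_t *max)
{
	uint32_t limit = pasid_max_for_bits(bits);

	if (*min > *max || *min > limit)
		return -1;
	if (*max > limit)
		*max = limit;
	return 0;
}
```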

> + * returns: allocated PASID value on success, -errno on failure for
> + *	    ALLOC_PASID;
> + *	    0 for FREE_PASID operation;
> + */
> +struct vfio_iommu_type1_pasid_request {
> +	__u32	argsz;
> +#define VFIO_IOMMU_FLAG_ALLOC_PASID	(1 << 0)
> +#define VFIO_IOMMU_FLAG_FREE_PASID	(1 << 1)
> +	__u32	flags;
> +	struct {
> +		__u32	min;
> +		__u32	max;
> +	} range;
> +};
> +
> +#define VFIO_PASID_REQUEST_MASK	(VFIO_IOMMU_FLAG_ALLOC_PASID | \
> +					 VFIO_IOMMU_FLAG_FREE_PASID)
> +
> +#define VFIO_IOMMU_PASID_BITS		31
> +
> +#define VFIO_IOMMU_PASID_REQUEST	_IO(VFIO_TYPE, VFIO_BASE + 18)
> +
>  /* -------- Additional API for SPAPR TCE (Server POWERPC) IOMMU -------- */
>  
>  /*



