linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] vfio/iommu_type1: report the IOMMU aperture info
@ 2017-11-30 11:34 Pierre Morel
  2017-11-30 12:57 ` Jean-Philippe Brucker
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Pierre Morel @ 2017-11-30 11:34 UTC (permalink / raw)
  To: alex.williamson; +Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

When userland VFIO defines a new IOMMU for a guest it may
want to specify to the guest the physical limits of
the underlying host IOMMU to avoid access to forbidden
memory ranges.

Currently, the vfio_iommu_type1 driver does not report this
information to userland.

Let's extend the vfio_iommu_type1_info structure reported
by the ioctl VFIO_IOMMU_GET_INFO command to report the
IOMMU limits as new uint64_t entries aperture_start and
aperture_end.

Let's also extend the flags bit map to add a flag specifying
if this extension of the info structure is reported or not.

Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
---
 drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/vfio.h       |  3 +++
 2 files changed, 45 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 8549cb1..7da5fe0 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
 	return ret;
 }
 
+/**
+ * vfio_get_aperture - report minimal aperture of a vfio_iommu
+ * @iommu: the current vfio_iommu
+ * @start: a pointer to the aperture start
+ * @end  : a pointer to the aperture end
+ *
+ * This function iterate on the domains using the given vfio_iommu
+ * and restrict the aperture to the minimal aperture common
+ * to all domains sharing this vfio_iommu.
+ */
+static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
+				uint64_t *end)
+{
+	struct iommu_domain_geometry geometry;
+	struct vfio_domain *domain;
+
+	*start = 0;
+	*end = U64_MAX;
+
+	mutex_lock(&iommu->lock);
+	/* loop on all domains using this vfio_iommu */
+	list_for_each_entry(domain, &iommu->domain_list, next) {
+		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
+					&geometry);
+		if (geometry.force_aperture) {
+			if (geometry.aperture_start > *start)
+				*start = geometry.aperture_start;
+			if (geometry.aperture_end < *end)
+				*end = geometry.aperture_end;
+		}
+	}
+	mutex_unlock(&iommu->lock);
+}
+
 static long vfio_iommu_type1_ioctl(void *iommu_data,
 				   unsigned int cmd, unsigned long arg)
 {
@@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 
 		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
 
+		minsz = min_t(size_t, info.argsz, sizeof(info));
+		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
+					 aperture_end)) {
+			info.flags |= VFIO_IOMMU_INFO_APERTURE;
+			vfio_get_aperture(iommu, &info.aperture_start,
+					  &info.aperture_end);
+		}
+
 		return copy_to_user((void __user *)arg, &info, minsz) ?
 			-EFAULT : 0;
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0fb25fb..780d909 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
 	__u32	flags;
 #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
 	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
+#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
+	__u64   aperture_start;		/* start of DMA aperture */
+	__u64   aperture_end;		/* end of DMA aperture */
 };
 
 #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 11:34 [PATCH] vfio/iommu_type1: report the IOMMU aperture info Pierre Morel
@ 2017-11-30 12:57 ` Jean-Philippe Brucker
  2017-11-30 14:16   ` Pierre Morel
  2017-11-30 19:02   ` Alex Williamson
  2017-11-30 13:22 ` Auger Eric
  2017-11-30 14:08 ` Alex Williamson
  2 siblings, 2 replies; 15+ messages in thread
From: Jean-Philippe Brucker @ 2017-11-30 12:57 UTC (permalink / raw)
  To: Pierre Morel, alex.williamson
  Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

Hello,

On 30/11/17 11:34, Pierre Morel wrote:
[...]
> +/**
> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
> + * @iommu: the current vfio_iommu
> + * @start: a pointer to the aperture start
> + * @end  : a pointer to the aperture end
> + *
> + * This function iterate on the domains using the given vfio_iommu
> + * and restrict the aperture to the minimal aperture common
> + * to all domains sharing this vfio_iommu.
> + */
> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
> +				uint64_t *end)
> +{
> +	struct iommu_domain_geometry geometry;
> +	struct vfio_domain *domain;
> +
> +	*start = 0;
> +	*end = U64_MAX;

I wonder if the default values should also reflect what the VFIO
implementation actually supports. Looking at vfio_dma_do_map, a 32-bit
host will reject any iova greater than 32 bits. In addition,
vfio_dma_do_unmap doesn't support unmapping the last page of a 64-bit
address space (existing IOMMUs would probably reject map requests with
IOVA > 52 bits anyway, but if they don't report a domain aperture, VFIO
can't guess it).

I think it's convenient to use VFIO_IOMMU_UNMAP_DMA on the full address
space when an unmap-all is needed, maybe we could provide default aperture
values that help doing this? (~0U for 32-bit and (~0ULL - PAGE_SIZE) for
64-bit)

Thanks,
Jean

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 11:34 [PATCH] vfio/iommu_type1: report the IOMMU aperture info Pierre Morel
  2017-11-30 12:57 ` Jean-Philippe Brucker
@ 2017-11-30 13:22 ` Auger Eric
  2017-11-30 14:17   ` Pierre Morel
  2017-11-30 14:08 ` Alex Williamson
  2 siblings, 1 reply; 15+ messages in thread
From: Auger Eric @ 2017-11-30 13:22 UTC (permalink / raw)
  To: Pierre Morel, alex.williamson
  Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

Hi Pierre,
On 30/11/17 12:34, Pierre Morel wrote:
> When userland VFIO defines a new IOMMU for a guest it may
> want to specify to the guest the physical limits of
> the underlying host IOMMU to avoid access to forbidden
> memory ranges.
> 
> Currently, the vfio_iommu_type1 driver does not report this
> information to userland.
> 
> Let's extend the vfio_iommu_type1_info structure reported
> by the ioctl VFIO_IOMMU_GET_INFO command to report the
> IOMMU limits as new uint64_t entries aperture_start and
> aperture_end.
> 
> Let's also extend the flags bit map to add a flag specifying
> if this extension of the info structure is reported or not.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/vfio.h       |  3 +++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 8549cb1..7da5fe0 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>  	return ret;
>  }
>  
> +/**
> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
> + * @iommu: the current vfio_iommu
> + * @start: a pointer to the aperture start
> + * @end  : a pointer to the aperture end
> + *
> + * This function iterate on the domains using the given vfio_iommu
> + * and restrict the aperture to the minimal aperture common
> + * to all domains sharing this vfio_iommu.
> + */
> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
> +				uint64_t *end)
> +{
> +	struct iommu_domain_geometry geometry;
> +	struct vfio_domain *domain;
> +
> +	*start = 0;
> +	*end = U64_MAX;
> +
> +	mutex_lock(&iommu->lock);
> +	/* loop on all domains using this vfio_iommu */
> +	list_for_each_entry(domain, &iommu->domain_list, next) {
> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
> +					&geometry);
> +		if (geometry.force_aperture) {
> +			if (geometry.aperture_start > *start)
> +				*start = geometry.aperture_start;
> +			if (geometry.aperture_end < *end)
> +				*end = geometry.aperture_end;
> +		}
> +	}
> +	mutex_unlock(&iommu->lock);
> +}
> +
>  static long vfio_iommu_type1_ioctl(void *iommu_data,
>  				   unsigned int cmd, unsigned long arg)
>  {
> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  
>  		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>  
> +		minsz = min_t(size_t, info.argsz, sizeof(info));
> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
> +					 aperture_end)) {
> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
> +			vfio_get_aperture(iommu, &info.aperture_start,
> +					  &info.aperture_end);
> +		}
> +
>  		return copy_to_user((void __user *)arg, &info, minsz) ?
>  			-EFAULT : 0;
>  
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 0fb25fb..780d909 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
>  	__u32	flags;
>  #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
>  	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
> +	__u64   aperture_start;		/* start of DMA aperture */
> +	__u64   aperture_end;		/* end of DMA aperture */
>  };
>  
>  #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
> 

In the past I was told by Alex that using the capability chain
extensions to add some new capabilities was a better approach:

https://lkml.org/lkml/2016/4/6/751

I did that kind of exercise in
[PATCH v9 7/7] vfio/type1: return MSI geometry through
VFIO_IOMMU_GET_INFO capability chains

https://lists.linuxfoundation.org/pipermail/iommu/2016-May/016892.html

The patch was not used eventually but maybe that can help.

Thanks

Eric

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 11:34 [PATCH] vfio/iommu_type1: report the IOMMU aperture info Pierre Morel
  2017-11-30 12:57 ` Jean-Philippe Brucker
  2017-11-30 13:22 ` Auger Eric
@ 2017-11-30 14:08 ` Alex Williamson
  2017-11-30 15:11   ` Pierre Morel
  2 siblings, 1 reply; 15+ messages in thread
From: Alex Williamson @ 2017-11-30 14:08 UTC (permalink / raw)
  To: Pierre Morel; +Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On Thu, 30 Nov 2017 12:34:38 +0100
Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:

> When userland VFIO defines a new IOMMU for a guest it may
> want to specify to the guest the physical limits of
> the underlying host IOMMU to avoid access to forbidden
> memory ranges.
> 
> Currently, the vfio_iommu_type1 driver does not report this
> information to userland.
> 
> Let's extend the vfio_iommu_type1_info structure reported
> by the ioctl VFIO_IOMMU_GET_INFO command to report the
> IOMMU limits as new uint64_t entries aperture_start and
> aperture_end.
> 
> Let's also extend the flags bit map to add a flag specifying
> if this extension of the info structure is reported or not.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/vfio.h       |  3 +++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 8549cb1..7da5fe0 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>  	return ret;
>  }
>  
> +/**
> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
> + * @iommu: the current vfio_iommu
> + * @start: a pointer to the aperture start
> + * @end  : a pointer to the aperture end
> + *
> + * This function iterate on the domains using the given vfio_iommu
> + * and restrict the aperture to the minimal aperture common
> + * to all domains sharing this vfio_iommu.
> + */
> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
> +				uint64_t *end)
> +{
> +	struct iommu_domain_geometry geometry;
> +	struct vfio_domain *domain;
> +
> +	*start = 0;
> +	*end = U64_MAX;
> +
> +	mutex_lock(&iommu->lock);
> +	/* loop on all domains using this vfio_iommu */
> +	list_for_each_entry(domain, &iommu->domain_list, next) {
> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
> +					&geometry);
> +		if (geometry.force_aperture) {
> +			if (geometry.aperture_start > *start)
> +				*start = geometry.aperture_start;
> +			if (geometry.aperture_end < *end)
> +				*end = geometry.aperture_end;
> +		}
> +	}
> +	mutex_unlock(&iommu->lock);
> +}
> +
>  static long vfio_iommu_type1_ioctl(void *iommu_data,
>  				   unsigned int cmd, unsigned long arg)
>  {
> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  
>  		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>  
> +		minsz = min_t(size_t, info.argsz, sizeof(info));
> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
> +					 aperture_end)) {
> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
> +			vfio_get_aperture(iommu, &info.aperture_start,
> +					  &info.aperture_end);
> +		}
> +
>  		return copy_to_user((void __user *)arg, &info, minsz) ?
>  			-EFAULT : 0;
>  
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 0fb25fb..780d909 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
>  	__u32	flags;
>  #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
>  	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
> +	__u64   aperture_start;		/* start of DMA aperture */
> +	__u64   aperture_end;		/* end of DMA aperture */
>  };
>  
>  #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)

This only supports the most simple topology, even x86 cannot claim to
have a single contiguous aperture, it's typically bisected by an MSI
window.  I think we need an API that supports one or more apertures
out of the box.  Also as Eric indicates, a capability is probably the
better option for creating a flexible structure.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 12:57 ` Jean-Philippe Brucker
@ 2017-11-30 14:16   ` Pierre Morel
  2017-11-30 14:49     ` Pierre Morel
  2017-11-30 19:02   ` Alex Williamson
  1 sibling, 1 reply; 15+ messages in thread
From: Pierre Morel @ 2017-11-30 14:16 UTC (permalink / raw)
  To: Jean-Philippe Brucker, alex.williamson
  Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On 30/11/2017 13:57, Jean-Philippe Brucker wrote:
> Hello,
> 
> On 30/11/17 11:34, Pierre Morel wrote:
> [...]
>> +/**
>> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
>> + * @iommu: the current vfio_iommu
>> + * @start: a pointer to the aperture start
>> + * @end  : a pointer to the aperture end
>> + *
>> + * This function iterate on the domains using the given vfio_iommu
>> + * and restrict the aperture to the minimal aperture common
>> + * to all domains sharing this vfio_iommu.
>> + */
>> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
>> +				uint64_t *end)
>> +{
>> +	struct iommu_domain_geometry geometry;
>> +	struct vfio_domain *domain;
>> +
>> +	*start = 0;
>> +	*end = U64_MAX;
> 
> I wonder if the default values should also reflect what the VFIO
> implementation actually supports. Looking at vfio_dma_do_map, a 32-bit
> host will reject any iova greater than 32 bits. In addition,
> vfio_dma_do_unmap doesn't support unmapping the last page of a 64-bit
> address space (existing IOMMUs would probably reject map requests with
> IOVA > 52 bits anyway, but if they don't report a domain aperture, VFIO
> can't guess it).
> 
> I think it's convenient to use VFIO_IOMMU_UNMAP_DMA on the full address
> space when an unmap-all is needed, maybe we could provide default aperture
> values that help doing this? (~0U for 32-bit and (~0ULL - PAGE_SIZE) for
> 64-bit)
> 
> Thanks,
> Jean
> 

Thanks, I will take care of this.

Pierre


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 13:22 ` Auger Eric
@ 2017-11-30 14:17   ` Pierre Morel
  0 siblings, 0 replies; 15+ messages in thread
From: Pierre Morel @ 2017-11-30 14:17 UTC (permalink / raw)
  To: Auger Eric, alex.williamson
  Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On 30/11/2017 14:22, Auger Eric wrote:
> Hi Pierre,
> On 30/11/17 12:34, Pierre Morel wrote:
>> When userland VFIO defines a new IOMMU for a guest it may
>> want to specify to the guest the physical limits of
>> the underlying host IOMMU to avoid access to forbidden
>> memory ranges.
>>
>> Currently, the vfio_iommu_type1 driver does not report this
>> information to userland.
>>
>> Let's extend the vfio_iommu_type1_info structure reported
>> by the ioctl VFIO_IOMMU_GET_INFO command to report the
>> IOMMU limits as new uint64_t entries aperture_start and
>> aperture_end.
>>
>> Let's also extend the flags bit map to add a flag specifying
>> if this extension of the info structure is reported or not.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>> ---
>>   drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
>>   include/uapi/linux/vfio.h       |  3 +++
>>   2 files changed, 45 insertions(+)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 8549cb1..7da5fe0 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>   	return ret;
>>   }
>>   
>> +/**
>> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
>> + * @iommu: the current vfio_iommu
>> + * @start: a pointer to the aperture start
>> + * @end  : a pointer to the aperture end
>> + *
>> + * This function iterate on the domains using the given vfio_iommu
>> + * and restrict the aperture to the minimal aperture common
>> + * to all domains sharing this vfio_iommu.
>> + */
>> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
>> +				uint64_t *end)
>> +{
>> +	struct iommu_domain_geometry geometry;
>> +	struct vfio_domain *domain;
>> +
>> +	*start = 0;
>> +	*end = U64_MAX;
>> +
>> +	mutex_lock(&iommu->lock);
>> +	/* loop on all domains using this vfio_iommu */
>> +	list_for_each_entry(domain, &iommu->domain_list, next) {
>> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
>> +					&geometry);
>> +		if (geometry.force_aperture) {
>> +			if (geometry.aperture_start > *start)
>> +				*start = geometry.aperture_start;
>> +			if (geometry.aperture_end < *end)
>> +				*end = geometry.aperture_end;
>> +		}
>> +	}
>> +	mutex_unlock(&iommu->lock);
>> +}
>> +
>>   static long vfio_iommu_type1_ioctl(void *iommu_data,
>>   				   unsigned int cmd, unsigned long arg)
>>   {
>> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>>   
>>   		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>>   
>> +		minsz = min_t(size_t, info.argsz, sizeof(info));
>> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
>> +					 aperture_end)) {
>> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
>> +			vfio_get_aperture(iommu, &info.aperture_start,
>> +					  &info.aperture_end);
>> +		}
>> +
>>   		return copy_to_user((void __user *)arg, &info, minsz) ?
>>   			-EFAULT : 0;
>>   
>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>> index 0fb25fb..780d909 100644
>> --- a/include/uapi/linux/vfio.h
>> +++ b/include/uapi/linux/vfio.h
>> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
>>   	__u32	flags;
>>   #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
>>   	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
>> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
>> +	__u64   aperture_start;		/* start of DMA aperture */
>> +	__u64   aperture_end;		/* end of DMA aperture */
>>   };
>>   
>>   #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
>>
> 
> In the past I was told by Alex that using the capability chain
> extensions to add some new capabilities was a better approach:
> 
> https://lkml.org/lkml/2016/4/6/751
> 
> I did that kind of exercise in
> [PATCH v9 7/7] vfio/type1: return MSI geometry through
> VFIO_IOMMU_GET_INFO capability chains
> 
> https://lists.linuxfoundation.org/pipermail/iommu/2016-May/016892.html
> 
> The patch was not used eventually but maybe that can help.
> 
> Thanks
> 
> Eric
> 

indeed interesting, I will go this way.
Thanks and also thanks for the pointers.

Pierre

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 14:16   ` Pierre Morel
@ 2017-11-30 14:49     ` Pierre Morel
  2017-11-30 15:23       ` Jean-Philippe Brucker
  0 siblings, 1 reply; 15+ messages in thread
From: Pierre Morel @ 2017-11-30 14:49 UTC (permalink / raw)
  To: Jean-Philippe Brucker, alex.williamson
  Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On 30/11/2017 15:16, Pierre Morel wrote:
> On 30/11/2017 13:57, Jean-Philippe Brucker wrote:
>> Hello,
>>
>> On 30/11/17 11:34, Pierre Morel wrote:
>> [...]
>>> +/**
>>> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
>>> + * @iommu: the current vfio_iommu
>>> + * @start: a pointer to the aperture start
>>> + * @end  : a pointer to the aperture end
>>> + *
>>> + * This function iterate on the domains using the given vfio_iommu
>>> + * and restrict the aperture to the minimal aperture common
>>> + * to all domains sharing this vfio_iommu.
>>> + */
>>> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t 
>>> *start,
>>> +                uint64_t *end)
>>> +{
>>> +    struct iommu_domain_geometry geometry;
>>> +    struct vfio_domain *domain;
>>> +
>>> +    *start = 0;
>>> +    *end = U64_MAX;
>>
>> I wonder if the default values should also reflect what the VFIO
>> implementation actually supports. Looking at vfio_dma_do_map, a 32-bit
>> host will reject any iova greater than 32 bits. In addition,
>> vfio_dma_do_unmap doesn't support unmapping the last page of a 64-bit
>> address space (existing IOMMUs would probably reject map requests with
>> IOVA > 52 bits anyway, but if they don't report a domain aperture, VFIO
>> can't guess it).
>>
>> I think it's convenient to use VFIO_IOMMU_UNMAP_DMA on the full address
>> space when an unmap-all is needed, maybe we could provide default 
>> aperture
>> values that help doing this? (~0U for 32-bit and (~0ULL - PAGE_SIZE) for
>> 64-bit)


indeed the error is to use uint64_t, I should *never* use this hardware 
specific values but simply unsigned long and -1UL/~0U at this level.

Shouldn't the 52bit problem be reported by the iommu geometry?


>>
>> Thanks,
>> Jean
>>
> 
> Thanks, I will take care of this.
> 
> Pierre
> 
> 


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 14:08 ` Alex Williamson
@ 2017-11-30 15:11   ` Pierre Morel
  2017-11-30 18:30     ` Alex Williamson
  0 siblings, 1 reply; 15+ messages in thread
From: Pierre Morel @ 2017-11-30 15:11 UTC (permalink / raw)
  To: Alex Williamson; +Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On 30/11/2017 15:08, Alex Williamson wrote:
> On Thu, 30 Nov 2017 12:34:38 +0100
> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
> 
>> When userland VFIO defines a new IOMMU for a guest it may
>> want to specify to the guest the physical limits of
>> the underlying host IOMMU to avoid access to forbidden
>> memory ranges.
>>
>> Currently, the vfio_iommu_type1 driver does not report this
>> information to userland.
>>
>> Let's extend the vfio_iommu_type1_info structure reported
>> by the ioctl VFIO_IOMMU_GET_INFO command to report the
>> IOMMU limits as new uint64_t entries aperture_start and
>> aperture_end.
>>
>> Let's also extend the flags bit map to add a flag specifying
>> if this extension of the info structure is reported or not.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>> ---
>>   drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
>>   include/uapi/linux/vfio.h       |  3 +++
>>   2 files changed, 45 insertions(+)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 8549cb1..7da5fe0 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>   	return ret;
>>   }
>>   
>> +/**
>> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
>> + * @iommu: the current vfio_iommu
>> + * @start: a pointer to the aperture start
>> + * @end  : a pointer to the aperture end
>> + *
>> + * This function iterate on the domains using the given vfio_iommu
>> + * and restrict the aperture to the minimal aperture common
>> + * to all domains sharing this vfio_iommu.
>> + */
>> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
>> +				uint64_t *end)
>> +{
>> +	struct iommu_domain_geometry geometry;
>> +	struct vfio_domain *domain;
>> +
>> +	*start = 0;
>> +	*end = U64_MAX;
>> +
>> +	mutex_lock(&iommu->lock);
>> +	/* loop on all domains using this vfio_iommu */
>> +	list_for_each_entry(domain, &iommu->domain_list, next) {
>> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
>> +					&geometry);
>> +		if (geometry.force_aperture) {
>> +			if (geometry.aperture_start > *start)
>> +				*start = geometry.aperture_start;
>> +			if (geometry.aperture_end < *end)
>> +				*end = geometry.aperture_end;
>> +		}
>> +	}
>> +	mutex_unlock(&iommu->lock);
>> +}
>> +
>>   static long vfio_iommu_type1_ioctl(void *iommu_data,
>>   				   unsigned int cmd, unsigned long arg)
>>   {
>> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>>   
>>   		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>>   
>> +		minsz = min_t(size_t, info.argsz, sizeof(info));
>> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
>> +					 aperture_end)) {
>> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
>> +			vfio_get_aperture(iommu, &info.aperture_start,
>> +					  &info.aperture_end);
>> +		}
>> +
>>   		return copy_to_user((void __user *)arg, &info, minsz) ?
>>   			-EFAULT : 0;
>>   
>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>> index 0fb25fb..780d909 100644
>> --- a/include/uapi/linux/vfio.h
>> +++ b/include/uapi/linux/vfio.h
>> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
>>   	__u32	flags;
>>   #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
>>   	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
>> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
>> +	__u64   aperture_start;		/* start of DMA aperture */
>> +	__u64   aperture_end;		/* end of DMA aperture */
>>   };
>>   
>>   #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
> 
> This only supports the most simple topology, even x86 cannot claim to
> have a single contiguous aperture, it's typically bisected by an MSI
> window.  I think we need an API that supports one or more apertures
> out of the box.  Also as Eric indicates, a capability is probably the
> better option for creating a flexible structure.  Thanks,
> 
> Alex
> 


Yes, I understand that a capability here is a must, I will follow this way.

For having multiple aperture and MSI protection, I understood it was 
done using windows and reserved regions.
Can you point me to my error?

Thanks,

Pierre

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 14:49     ` Pierre Morel
@ 2017-11-30 15:23       ` Jean-Philippe Brucker
  2017-12-01  8:50         ` Pierre Morel
  0 siblings, 1 reply; 15+ messages in thread
From: Jean-Philippe Brucker @ 2017-11-30 15:23 UTC (permalink / raw)
  To: Pierre Morel, alex.williamson
  Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On 30/11/17 14:49, Pierre Morel wrote:
[...]
> Shouldn't the 52bit problem be reported by the iommu geometry?

Yes, and most IOMMUs seem to set force_aperture. My comment was only about
default values to adopt in case a new IOMMU driver doesn't set
force_aperture to true.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 15:11   ` Pierre Morel
@ 2017-11-30 18:30     ` Alex Williamson
  2017-12-01  9:38       ` Pierre Morel
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Williamson @ 2017-11-30 18:30 UTC (permalink / raw)
  To: Pierre Morel; +Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On Thu, 30 Nov 2017 16:11:35 +0100
Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:

> On 30/11/2017 15:08, Alex Williamson wrote:
> > On Thu, 30 Nov 2017 12:34:38 +0100
> > Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
> >   
> >> When userland VFIO defines a new IOMMU for a guest it may
> >> want to specify to the guest the physical limits of
> >> the underlying host IOMMU to avoid access to forbidden
> >> memory ranges.
> >>
> >> Currently, the vfio_iommu_type1 driver does not report this
> >> information to userland.
> >>
> >> Let's extend the vfio_iommu_type1_info structure reported
> >> by the ioctl VFIO_IOMMU_GET_INFO command to report the
> >> IOMMU limits as new uint64_t entries aperture_start and
> >> aperture_end.
> >>
> >> Let's also extend the flags bit map to add a flag specifying
> >> if this extension of the info structure is reported or not.
> >>
> >> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >> ---
> >>   drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
> >>   include/uapi/linux/vfio.h       |  3 +++
> >>   2 files changed, 45 insertions(+)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index 8549cb1..7da5fe0 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> >>   	return ret;
> >>   }
> >>   
> >> +/**
> >> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
> >> + * @iommu: the current vfio_iommu
> >> + * @start: a pointer to the aperture start
> >> + * @end  : a pointer to the aperture end
> >> + *
> >> + * This function iterate on the domains using the given vfio_iommu
> >> + * and restrict the aperture to the minimal aperture common
> >> + * to all domains sharing this vfio_iommu.
> >> + */
> >> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
> >> +				uint64_t *end)
> >> +{
> >> +	struct iommu_domain_geometry geometry;
> >> +	struct vfio_domain *domain;
> >> +
> >> +	*start = 0;
> >> +	*end = U64_MAX;
> >> +
> >> +	mutex_lock(&iommu->lock);
> >> +	/* loop on all domains using this vfio_iommu */
> >> +	list_for_each_entry(domain, &iommu->domain_list, next) {
> >> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
> >> +					&geometry);
> >> +		if (geometry.force_aperture) {
> >> +			if (geometry.aperture_start > *start)
> >> +				*start = geometry.aperture_start;
> >> +			if (geometry.aperture_end < *end)
> >> +				*end = geometry.aperture_end;
> >> +		}
> >> +	}
> >> +	mutex_unlock(&iommu->lock);
> >> +}
> >> +
> >>   static long vfio_iommu_type1_ioctl(void *iommu_data,
> >>   				   unsigned int cmd, unsigned long arg)
> >>   {
> >> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
> >>   
> >>   		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
> >>   
> >> +		minsz = min_t(size_t, info.argsz, sizeof(info));
> >> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
> >> +					 aperture_end)) {
> >> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
> >> +			vfio_get_aperture(iommu, &info.aperture_start,
> >> +					  &info.aperture_end);
> >> +		}
> >> +
> >>   		return copy_to_user((void __user *)arg, &info, minsz) ?
> >>   			-EFAULT : 0;
> >>   
> >> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >> index 0fb25fb..780d909 100644
> >> --- a/include/uapi/linux/vfio.h
> >> +++ b/include/uapi/linux/vfio.h
> >> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
> >>   	__u32	flags;
> >>   #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
> >>   	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
> >> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
> >> +	__u64   aperture_start;		/* start of DMA aperture */
> >> +	__u64   aperture_end;		/* end of DMA aperture */
> >>   };
> >>   
> >>   #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)  
> > 
> > This only supports the most simple topology, even x86 cannot claim to
> > have a single contiguous aperture, it's typically bisected by an MSI
> > window.  I think we need an API that supports one or more apertures
> > out of the box.  Also as Eric indicates, a capability is probably the
> > better option for creating a flexible structure.  Thanks,
> > 
> > Alex
> >   
> 
> 
> Yes, I understand that a capability here is a must, I will follow this way.
> 
> For having multiple aperture and MSI protection, I understood it was 
> done using windows and reserved regions.
> Can you point me to my error?

See the thread from Huawei, I don't think that's a solved problem:

https://lists.gnu.org/archive/html/qemu-arm/2017-11/msg00237.html

If you want sysfs to be consumed separately by the user and fed into
new QEMU command line options for creating a VM layout, perhaps that's
sufficient, but I think the vfio api for the iommu should encompass
describing available ranges of mappable iova space without cobbling
together arbitrary info from sysfs.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 12:57 ` Jean-Philippe Brucker
  2017-11-30 14:16   ` Pierre Morel
@ 2017-11-30 19:02   ` Alex Williamson
  1 sibling, 0 replies; 15+ messages in thread
From: Alex Williamson @ 2017-11-30 19:02 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Pierre Morel, cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On Thu, 30 Nov 2017 12:57:56 +0000
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> wrote:

> Hello,
> 
> On 30/11/17 11:34, Pierre Morel wrote:
> [...]
> > +/**
> > + * vfio_get_aperture - report minimal aperture of a vfio_iommu
> > + * @iommu: the current vfio_iommu
> > + * @start: a pointer to the aperture start
> > + * @end  : a pointer to the aperture end
> > + *
> > + * This function iterate on the domains using the given vfio_iommu
> > + * and restrict the aperture to the minimal aperture common
> > + * to all domains sharing this vfio_iommu.
> > + */
> > +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
> > +				uint64_t *end)
> > +{
> > +	struct iommu_domain_geometry geometry;
> > +	struct vfio_domain *domain;
> > +
> > +	*start = 0;
> > +	*end = U64_MAX;  
> 
> I wonder if the default values should also reflect what the VFIO
> implementation actually supports. Looking at vfio_dma_do_map, a 32-bit
> host will reject any iova greater than 32 bits. In addition,
> vfio_dma_do_unmap doesn't support unmapping the last page of a 64-bit
> address space (existing IOMMUs would probably reject map requests with
> IOVA > 52 bits anyway, but if they don't report a domain aperture, VFIO
> can't guess it).
> 
> I think it's convenient to use VFIO_IOMMU_UNMAP_DMA on the full address
> space when an unmap-all is needed, maybe we could provide default aperture
> values that help doing this? (~0U for 32-bit and (~0ULL - PAGE_SIZE) for
> 64-bit)

Hmm, I'm not a huge fan of that.  Looks like there are two bugs, one
that I've known about but we can't fix[1] in vfio_iommu_type1_dma_unmap
using start and size rather than start and end, therefore the user can
only express a range of (U64_MAX - 1).  That's part of the API though,
live and learn.

The other is the check for iova wrap that doesn't consider the boundary
case.  Seems like this could simply be fixed.  In any case, I'd hate
for the apertures to impose that one-off if the hardware otherwise
doesn't.  Thanks,

Alex

[1] Really the vfio api is flexible enough to solve this, a flag bit in
vfio_iommu_type1_info could indicate support for an alternate map/unmap
mode where size becomes end and a flag bit on the map and unmap structs
could indicate use of that mode.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 15:23       ` Jean-Philippe Brucker
@ 2017-12-01  8:50         ` Pierre Morel
  0 siblings, 0 replies; 15+ messages in thread
From: Pierre Morel @ 2017-12-01  8:50 UTC (permalink / raw)
  To: Jean-Philippe Brucker, alex.williamson
  Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On 30/11/2017 16:23, Jean-Philippe Brucker wrote:
> On 30/11/17 14:49, Pierre Morel wrote:
> [...]
>> Shouldn't the 52bit problem be reported by the iommu geometry?
> 
> Yes, and most IOMMUs seem to set force_aperture. My comment was only about
> default values to adopt in case a new IOMMU driver doesn't set
> force_aperture to true.
> 
> Thanks,
> Jean
> 

Yes, clear, thanks.

Pierre

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-11-30 18:30     ` Alex Williamson
@ 2017-12-01  9:38       ` Pierre Morel
  2017-12-01 16:22         ` Alex Williamson
  0 siblings, 1 reply; 15+ messages in thread
From: Pierre Morel @ 2017-12-01  9:38 UTC (permalink / raw)
  To: Alex Williamson; +Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On 30/11/2017 19:30, Alex Williamson wrote:
> On Thu, 30 Nov 2017 16:11:35 +0100
> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
> 
>> On 30/11/2017 15:08, Alex Williamson wrote:
>>> On Thu, 30 Nov 2017 12:34:38 +0100
>>> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>>>    
>>>> When userland VFIO defines a new IOMMU for a guest it may
>>>> want to specify to the guest the physical limits of
>>>> the underlying host IOMMU to avoid access to forbidden
>>>> memory ranges.
>>>>
>>>> Currently, the vfio_iommu_type1 driver does not report this
>>>> information to userland.
>>>>
>>>> Let's extend the vfio_iommu_type1_info structure reported
>>>> by the ioctl VFIO_IOMMU_GET_INFO command to report the
>>>> IOMMU limits as new uint64_t entries aperture_start and
>>>> aperture_end.
>>>>
>>>> Let's also extend the flags bit map to add a flag specifying
>>>> if this extension of the info structure is reported or not.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>> ---
>>>>    drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
>>>>    include/uapi/linux/vfio.h       |  3 +++
>>>>    2 files changed, 45 insertions(+)
>>>>
>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>>>> index 8549cb1..7da5fe0 100644
>>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>>> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>>>    	return ret;
>>>>    }
>>>>    
>>>> +/**
>>>> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
>>>> + * @iommu: the current vfio_iommu
>>>> + * @start: a pointer to the aperture start
>>>> + * @end  : a pointer to the aperture end
>>>> + *
>>>> + * This function iterate on the domains using the given vfio_iommu
>>>> + * and restrict the aperture to the minimal aperture common
>>>> + * to all domains sharing this vfio_iommu.
>>>> + */
>>>> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
>>>> +				uint64_t *end)
>>>> +{
>>>> +	struct iommu_domain_geometry geometry;
>>>> +	struct vfio_domain *domain;
>>>> +
>>>> +	*start = 0;
>>>> +	*end = U64_MAX;
>>>> +
>>>> +	mutex_lock(&iommu->lock);
>>>> +	/* loop on all domains using this vfio_iommu */
>>>> +	list_for_each_entry(domain, &iommu->domain_list, next) {
>>>> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
>>>> +					&geometry);
>>>> +		if (geometry.force_aperture) {
>>>> +			if (geometry.aperture_start > *start)
>>>> +				*start = geometry.aperture_start;
>>>> +			if (geometry.aperture_end < *end)
>>>> +				*end = geometry.aperture_end;
>>>> +		}
>>>> +	}
>>>> +	mutex_unlock(&iommu->lock);
>>>> +}
>>>> +
>>>>    static long vfio_iommu_type1_ioctl(void *iommu_data,
>>>>    				   unsigned int cmd, unsigned long arg)
>>>>    {
>>>> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>>>>    
>>>>    		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>>>>    
>>>> +		minsz = min_t(size_t, info.argsz, sizeof(info));
>>>> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
>>>> +					 aperture_end)) {
>>>> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
>>>> +			vfio_get_aperture(iommu, &info.aperture_start,
>>>> +					  &info.aperture_end);
>>>> +		}
>>>> +
>>>>    		return copy_to_user((void __user *)arg, &info, minsz) ?
>>>>    			-EFAULT : 0;
>>>>    
>>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>>>> index 0fb25fb..780d909 100644
>>>> --- a/include/uapi/linux/vfio.h
>>>> +++ b/include/uapi/linux/vfio.h
>>>> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
>>>>    	__u32	flags;
>>>>    #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
>>>>    	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
>>>> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
>>>> +	__u64   aperture_start;		/* start of DMA aperture */
>>>> +	__u64   aperture_end;		/* end of DMA aperture */
>>>>    };
>>>>    
>>>>    #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
>>>
>>> This only supports the most simple topology, even x86 cannot claim to
>>> have a single contiguous aperture, it's typically bisected by an MSI
>>> window.  I think we need an API that supports one or more apertures
>>> out of the box.  Also as Eric indicates, a capability is probably the
>>> better option for creating a flexible structure.  Thanks,
>>>
>>> Alex
>>>    
>>
>>
>> Yes, I understand that a capability here is a must, I will follow this way.
>>
>> For having multiple aperture and MSI protection, I understood it was
>> done using windows and reserved regions.
>> Can you point me to my error?
> 
> See the thread from Huawei, I don't think that's a solved problem:
> 
> https://lists.gnu.org/archive/html/qemu-arm/2017-11/msg00237.html
> 
> If you want sysfs to be consumed separately by the user and fed into
> new QEMU command line options for creating a VM layout, perhaps that's
> sufficient, but I think the vfio api for the iommu should encompass
> describing available ranges of mappable iova space without cobbling
> together arbitrary info from sysfs.  Thanks,
> 
> Alex
> 

Hi Alex,

I resume to see if I understood you well:

We may have physical IOMMUs with a more complex access that can not be 
specified by only defining the start and end of a read/write region.

Windows can be used to reserve regions for the VM but it is not what we 
want. What we want is to know what the host can offer which is a mix of 
aperture and windows.

To report this we can use capabilities in a positive way, describing 
what the host offers not what it can not provide.

To achieve this we have to use two interfaces:
- VFIO user interface with VFIO_IOMMU_GET_INFO and capabilities
- Physical IOMMU interface with both geometry and window iommu_ops 
callbacks.

If it is sufficiently near from what you thought I will provide a new 
version in this direction.

Regards,

Pierre



-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-12-01  9:38       ` Pierre Morel
@ 2017-12-01 16:22         ` Alex Williamson
  2017-12-04 12:36           ` Pierre Morel
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Williamson @ 2017-12-01 16:22 UTC (permalink / raw)
  To: Pierre Morel; +Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On Fri, 1 Dec 2017 10:38:07 +0100
Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:

> On 30/11/2017 19:30, Alex Williamson wrote:
> > On Thu, 30 Nov 2017 16:11:35 +0100
> > Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
> >   
> >> On 30/11/2017 15:08, Alex Williamson wrote:  
> >>> On Thu, 30 Nov 2017 12:34:38 +0100
> >>> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
> >>>      
> >>>> When userland VFIO defines a new IOMMU for a guest it may
> >>>> want to specify to the guest the physical limits of
> >>>> the underlying host IOMMU to avoid access to forbidden
> >>>> memory ranges.
> >>>>
> >>>> Currently, the vfio_iommu_type1 driver does not report this
> >>>> information to userland.
> >>>>
> >>>> Let's extend the vfio_iommu_type1_info structure reported
> >>>> by the ioctl VFIO_IOMMU_GET_INFO command to report the
> >>>> IOMMU limits as new uint64_t entries aperture_start and
> >>>> aperture_end.
> >>>>
> >>>> Let's also extend the flags bit map to add a flag specifying
> >>>> if this extension of the info structure is reported or not.
> >>>>
> >>>> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >>>> ---
> >>>>    drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
> >>>>    include/uapi/linux/vfio.h       |  3 +++
> >>>>    2 files changed, 45 insertions(+)
> >>>>
> >>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >>>> index 8549cb1..7da5fe0 100644
> >>>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>>> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> >>>>    	return ret;
> >>>>    }
> >>>>    
> >>>> +/**
> >>>> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
> >>>> + * @iommu: the current vfio_iommu
> >>>> + * @start: a pointer to the aperture start
> >>>> + * @end  : a pointer to the aperture end
> >>>> + *
> >>>> + * This function iterate on the domains using the given vfio_iommu
> >>>> + * and restrict the aperture to the minimal aperture common
> >>>> + * to all domains sharing this vfio_iommu.
> >>>> + */
> >>>> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
> >>>> +				uint64_t *end)
> >>>> +{
> >>>> +	struct iommu_domain_geometry geometry;
> >>>> +	struct vfio_domain *domain;
> >>>> +
> >>>> +	*start = 0;
> >>>> +	*end = U64_MAX;
> >>>> +
> >>>> +	mutex_lock(&iommu->lock);
> >>>> +	/* loop on all domains using this vfio_iommu */
> >>>> +	list_for_each_entry(domain, &iommu->domain_list, next) {
> >>>> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
> >>>> +					&geometry);
> >>>> +		if (geometry.force_aperture) {
> >>>> +			if (geometry.aperture_start > *start)
> >>>> +				*start = geometry.aperture_start;
> >>>> +			if (geometry.aperture_end < *end)
> >>>> +				*end = geometry.aperture_end;
> >>>> +		}
> >>>> +	}
> >>>> +	mutex_unlock(&iommu->lock);
> >>>> +}
> >>>> +
> >>>>    static long vfio_iommu_type1_ioctl(void *iommu_data,
> >>>>    				   unsigned int cmd, unsigned long arg)
> >>>>    {
> >>>> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
> >>>>    
> >>>>    		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
> >>>>    
> >>>> +		minsz = min_t(size_t, info.argsz, sizeof(info));
> >>>> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
> >>>> +					 aperture_end)) {
> >>>> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
> >>>> +			vfio_get_aperture(iommu, &info.aperture_start,
> >>>> +					  &info.aperture_end);
> >>>> +		}
> >>>> +
> >>>>    		return copy_to_user((void __user *)arg, &info, minsz) ?
> >>>>    			-EFAULT : 0;
> >>>>    
> >>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >>>> index 0fb25fb..780d909 100644
> >>>> --- a/include/uapi/linux/vfio.h
> >>>> +++ b/include/uapi/linux/vfio.h
> >>>> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
> >>>>    	__u32	flags;
> >>>>    #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
> >>>>    	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
> >>>> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
> >>>> +	__u64   aperture_start;		/* start of DMA aperture */
> >>>> +	__u64   aperture_end;		/* end of DMA aperture */
> >>>>    };
> >>>>    
> >>>>    #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)  
> >>>
> >>> This only supports the most simple topology, even x86 cannot claim to
> >>> have a single contiguous aperture, it's typically bisected by an MSI
> >>> window.  I think we need an API that supports one or more apertures
> >>> out of the box.  Also as Eric indicates, a capability is probably the
> >>> better option for creating a flexible structure.  Thanks,
> >>>
> >>> Alex
> >>>      
> >>
> >>
> >> Yes, I understand that a capability here is a must, I will follow this way.
> >>
> >> For having multiple aperture and MSI protection, I understood it was
> >> done using windows and reserved regions.
> >> Can you point me to my error?  
> > 
> > See the thread from Huawei, I don't think that's a solved problem:
> > 
> > https://lists.gnu.org/archive/html/qemu-arm/2017-11/msg00237.html
> > 
> > If you want sysfs to be consumed separately by the user and fed into
> > new QEMU command line options for creating a VM layout, perhaps that's
> > sufficient, but I think the vfio api for the iommu should encompass
> > describing available ranges of mappable iova space without cobbling
> > together arbitrary info from sysfs.  Thanks,
> > 
> > Alex
> >   
> 
> Hi Alex,
> 
> I resume to see if I understood you well:
> 
> We may have physical IOMMUs with a more complex access that can not be 
> specified by only defining the start and end of a read/write region.
> 
> Windows can be used to reserve regions for the VM but it is not what we 
> want. What we want is to know what the host can offer which is a mix of 
> aperture and windows.
> 
> To report this we can use capabilities in a positive way, describing 
> what the host offers not what it can not provide.
> 
> To achieve this we have to use two interfaces:
> - VFIO user interface with VFIO_IOMMU_GET_INFO and capabilities
> - Physical IOMMU interface with both geometry and window iommu_ops 
> callbacks.
> 
> If it is sufficiently near from what you thought I will provide a new 
> version in this direction.

I believe so.  VFIO would construct a set of mappable iova
regions/windows using information provided via the IOMMU API via
iommu_ops and expose this via a new capability supporting multiple such
regions via the VFIO_IOMMU_GET_INFO ioctl.  This ioctl would be
extended to support capabilities in the same way we've done so for
other vfio ioctls.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info
  2017-12-01 16:22         ` Alex Williamson
@ 2017-12-04 12:36           ` Pierre Morel
  0 siblings, 0 replies; 15+ messages in thread
From: Pierre Morel @ 2017-12-04 12:36 UTC (permalink / raw)
  To: Alex Williamson; +Cc: cohuck, borntraeger, zyimin, pasic, kvm, linux-kernel

On 01/12/2017 17:22, Alex Williamson wrote:
> On Fri, 1 Dec 2017 10:38:07 +0100
> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
> 
>> On 30/11/2017 19:30, Alex Williamson wrote:
>>> On Thu, 30 Nov 2017 16:11:35 +0100
>>> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>>>    
>>>> On 30/11/2017 15:08, Alex Williamson wrote:
>>>>> On Thu, 30 Nov 2017 12:34:38 +0100
>>>>> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>>>>>       
>>>>>> When userland VFIO defines a new IOMMU for a guest it may
>>>>>> want to specify to the guest the physical limits of
>>>>>> the underlying host IOMMU to avoid access to forbidden
>>>>>> memory ranges.
>>>>>>
>>>>>> Currently, the vfio_iommu_type1 driver does not report this
>>>>>> information to userland.
>>>>>>
>>>>>> Let's extend the vfio_iommu_type1_info structure reported
>>>>>> by the ioctl VFIO_IOMMU_GET_INFO command to report the
>>>>>> IOMMU limits as new uint64_t entries aperture_start and
>>>>>> aperture_end.
>>>>>>
>>>>>> Let's also extend the flags bit map to add a flag specifying
>>>>>> if this extension of the info structure is reported or not.
>>>>>>
>>>>>> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>     drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
>>>>>>     include/uapi/linux/vfio.h       |  3 +++
>>>>>>     2 files changed, 45 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>>>>>> index 8549cb1..7da5fe0 100644
>>>>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>>>>> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>>>>>     	return ret;
>>>>>>     }
>>>>>>     
>>>>>> +/**
>>>>>> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
>>>>>> + * @iommu: the current vfio_iommu
>>>>>> + * @start: a pointer to the aperture start
>>>>>> + * @end  : a pointer to the aperture end
>>>>>> + *
>>>>>> + * This function iterate on the domains using the given vfio_iommu
>>>>>> + * and restrict the aperture to the minimal aperture common
>>>>>> + * to all domains sharing this vfio_iommu.
>>>>>> + */
>>>>>> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
>>>>>> +				uint64_t *end)
>>>>>> +{
>>>>>> +	struct iommu_domain_geometry geometry;
>>>>>> +	struct vfio_domain *domain;
>>>>>> +
>>>>>> +	*start = 0;
>>>>>> +	*end = U64_MAX;
>>>>>> +
>>>>>> +	mutex_lock(&iommu->lock);
>>>>>> +	/* loop on all domains using this vfio_iommu */
>>>>>> +	list_for_each_entry(domain, &iommu->domain_list, next) {
>>>>>> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
>>>>>> +					&geometry);
>>>>>> +		if (geometry.force_aperture) {
>>>>>> +			if (geometry.aperture_start > *start)
>>>>>> +				*start = geometry.aperture_start;
>>>>>> +			if (geometry.aperture_end < *end)
>>>>>> +				*end = geometry.aperture_end;
>>>>>> +		}
>>>>>> +	}
>>>>>> +	mutex_unlock(&iommu->lock);
>>>>>> +}
>>>>>> +
>>>>>>     static long vfio_iommu_type1_ioctl(void *iommu_data,
>>>>>>     				   unsigned int cmd, unsigned long arg)
>>>>>>     {
>>>>>> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>>>>>>     
>>>>>>     		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>>>>>>     
>>>>>> +		minsz = min_t(size_t, info.argsz, sizeof(info));
>>>>>> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
>>>>>> +					 aperture_end)) {
>>>>>> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
>>>>>> +			vfio_get_aperture(iommu, &info.aperture_start,
>>>>>> +					  &info.aperture_end);
>>>>>> +		}
>>>>>> +
>>>>>>     		return copy_to_user((void __user *)arg, &info, minsz) ?
>>>>>>     			-EFAULT : 0;
>>>>>>     
>>>>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>>>>>> index 0fb25fb..780d909 100644
>>>>>> --- a/include/uapi/linux/vfio.h
>>>>>> +++ b/include/uapi/linux/vfio.h
>>>>>> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
>>>>>>     	__u32	flags;
>>>>>>     #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
>>>>>>     	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
>>>>>> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
>>>>>> +	__u64   aperture_start;		/* start of DMA aperture */
>>>>>> +	__u64   aperture_end;		/* end of DMA aperture */
>>>>>>     };
>>>>>>     
>>>>>>     #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
>>>>>
>>>>> This only supports the most simple topology, even x86 cannot claim to
>>>>> have a single contiguous aperture, it's typically bisected by an MSI
>>>>> window.  I think we need an API that supports one or more apertures
>>>>> out of the box.  Also as Eric indicates, a capability is probably the
>>>>> better option for creating a flexible structure.  Thanks,
>>>>>
>>>>> Alex
>>>>>       
>>>>
>>>>
>>>> Yes, I understand that a capability here is a must, I will follow this way.
>>>>
>>>> For having multiple aperture and MSI protection, I understood it was
>>>> done using windows and reserved regions.
>>>> Can you point me to my error?
>>>
>>> See the thread from Huawei, I don't think that's a solved problem:
>>>
>>> https://lists.gnu.org/archive/html/qemu-arm/2017-11/msg00237.html
>>>
>>> If you want sysfs to be consumed separately by the user and fed into
>>> new QEMU command line options for creating a VM layout, perhaps that's
>>> sufficient, but I think the vfio api for the iommu should encompass
>>> describing available ranges of mappable iova space without cobbling
>>> together arbitrary info from sysfs.  Thanks,
>>>
>>> Alex
>>>    
>>
>> Hi Alex,
>>
>> I resume to see if I understood you well:
>>
>> We may have physical IOMMUs with a more complex access that can not be
>> specified by only defining the start and end of a read/write region.
>>
>> Windows can be used to reserve regions for the VM but it is not what we
>> want. What we want is to know what the host can offer which is a mix of
>> aperture and windows.
>>
>> To report this we can use capabilities in a positive way, describing
>> what the host offers not what it can not provide.
>>
>> To achieve this we have to use two interfaces:
>> - VFIO user interface with VFIO_IOMMU_GET_INFO and capabilities
>> - Physical IOMMU interface with both geometry and window iommu_ops
>> callbacks.
>>
>> If it is sufficiently near from what you thought I will provide a new
>> version in this direction.
> 
> I believe so.  VFIO would construct a set of mappable iova
> regions/windows using information provided via the IOMMU API via
> iommu_ops and expose this via a new capability supporting multiple such
> regions via the VFIO_IOMMU_GET_INFO ioctl.  This ioctl would be
> extended to support capabilities in the same way we've done so for
> other vfio ioctls.  Thanks,
> 
> Alex
> 

Hi Alex,

Thanks, I go this way.

Pierre


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-12-04 12:37 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-30 11:34 [PATCH] vfio/iommu_type1: report the IOMMU aperture info Pierre Morel
2017-11-30 12:57 ` Jean-Philippe Brucker
2017-11-30 14:16   ` Pierre Morel
2017-11-30 14:49     ` Pierre Morel
2017-11-30 15:23       ` Jean-Philippe Brucker
2017-12-01  8:50         ` Pierre Morel
2017-11-30 19:02   ` Alex Williamson
2017-11-30 13:22 ` Auger Eric
2017-11-30 14:17   ` Pierre Morel
2017-11-30 14:08 ` Alex Williamson
2017-11-30 15:11   ` Pierre Morel
2017-11-30 18:30     ` Alex Williamson
2017-12-01  9:38       ` Pierre Morel
2017-12-01 16:22         ` Alex Williamson
2017-12-04 12:36           ` Pierre Morel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).