All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
       [not found] ` <1519280641-30258-3-git-send-email-xieyisheng1@huawei.com>
       [not found]   ` <1519280641-30258-3-git-send-email-xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2018-02-22  9:03       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-02-22  9:03 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, robh-DgEjT+Ai2ygdnm+yROfE0A,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA, Catalin Marinas,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rjw-LthD3rsA81gm4RdzfppkhA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Sudeep Holla,
	christian.koenig-5C7GfCeVMHo

Hi Jean,

On 2018/2/22 14:23, Jean-Philippe Brucker wrote:
> @@ -129,7 +439,10 @@ int iommu_sva_device_shutdown(struct device *dev)
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>  
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>  
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);

Should io_mm->pasid still be set to *pasid when the device already bond? so driver can
always get the right pasid if it bond to a mm multi-times, without keeping the pasid itself?

Thanks
Yisheng
> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)
> +		return 0;
> +
> +	if (!io_mm) {
> +		io_mm = io_mm_alloc(domain, dev, mm);
> +		if (IS_ERR(io_mm))
> +			return PTR_ERR(io_mm);
> +	}
> +
> +	ret = io_mm_attach(domain, dev, io_mm, drvdata);
> +	if (ret)
> +		io_mm_put(io_mm);
> +	else
> +		*pasid = io_mm->pasid;
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_bind_device);

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-02-22  9:03       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-02-22  9:03 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: linux-arm-kernel, linux-pci, linux-acpi, devicetree, iommu, kvm,
	joro, Mark Rutland, Catalin Marinas, Will Deacon,
	Lorenzo Pieralisi, hanjun.guo, Sudeep Holla, rjw, lenb,
	Robin Murphy, bhelgaas, alex.williamson, tn, liubo95,
	thunder.leizhen, xuzaibo, ilias.apalodimas, jonathan.cameron,
	shunyong.yang, nwatters, okaya, jcrouse, rfranz, dwmw2,
	jacob.jun.pan, yi.l.liu, ashok.raj, robdclark, christian.koenig,
	bharatku, robh

Hi Jean,

On 2018/2/22 14:23, Jean-Philippe Brucker wrote:
> @@ -129,7 +439,10 @@ int iommu_sva_device_shutdown(struct device *dev)
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>  
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>  
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);

Should io_mm->pasid still be set to *pasid when the device already bond? so driver can
always get the right pasid if it bond to a mm multi-times, without keeping the pasid itself?

Thanks
Yisheng
> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)
> +		return 0;
> +
> +	if (!io_mm) {
> +		io_mm = io_mm_alloc(domain, dev, mm);
> +		if (IS_ERR(io_mm))
> +			return PTR_ERR(io_mm);
> +	}
> +
> +	ret = io_mm_attach(domain, dev, io_mm, drvdata);
> +	if (ret)
> +		io_mm_put(io_mm);
> +	else
> +		*pasid = io_mm->pasid;
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_bind_device);


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-02-22  9:03       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-02-22  9:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jean,

On 2018/2/22 14:23, Jean-Philippe Brucker wrote:
> @@ -129,7 +439,10 @@ int iommu_sva_device_shutdown(struct device *dev)
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>  
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>  
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);

Should io_mm->pasid still be set to *pasid when the device already bond? so driver can
always get the right pasid if it bond to a mm multi-times, without keeping the pasid itself?

Thanks
Yisheng
> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)
> +		return 0;
> +
> +	if (!io_mm) {
> +		io_mm = io_mm_alloc(domain, dev, mm);
> +		if (IS_ERR(io_mm))
> +			return PTR_ERR(io_mm);
> +	}
> +
> +	ret = io_mm_attach(domain, dev, io_mm, drvdata);
> +	if (ret)
> +		io_mm_put(io_mm);
> +	else
> +		*pasid = io_mm->pasid;
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_bind_device);

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-02-22  9:03       ` Yisheng Xie
  (?)
@ 2018-02-22 13:17           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-02-22 13:17 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, robh-DgEjT+Ai2ygdnm+yROfE0A,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA, Catalin Marinas,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rjw-LthD3rsA81gHJAlzAAQw8A

On 22/02/18 09:03, Yisheng Xie wrote:
[...]
>> +		/* Is it already bound to this device? */
>> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
>> +			if (tmp->dev != dev)
>> +				continue;
>> +
>> +			bond = tmp;
>> +			refcount_inc(&bond->refs);
>> +			io_mm_put_locked(io_mm);
> 
> Should io_mm->pasid still be set to *pasid when the device already bond? so driver can
> always get the right pasid if it bond to a mm multi-times, without keeping the pasid itself?

(Assuming you mean set *pasid = io_mm->pasid) I think it should, I seem to
have removed it by accident from this version

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-02-22 13:17           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-02-22 13:17 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Mark Rutland, ilias.apalodimas, kvm, linux-pci, xuzaibo,
	jonathan.cameron, Will Deacon, okaya, robh, Lorenzo Pieralisi,
	ashok.raj, tn, joro, robdclark, bharatku, linux-acpi,
	Catalin Marinas, rfranz, lenb, devicetree, jacob.jun.pan,
	alex.williamson, yi.l.liu, thunder.leizhen, bhelgaas,
	linux-arm-kernel, shunyong.yang, dwmw2, liubo95, rjw, jcrouse,
	iommu, hanjun.guo, Sudeep Holla, Robin Murphy, christian.koenig,
	nwatters

On 22/02/18 09:03, Yisheng Xie wrote:
[...]
>> +		/* Is it already bound to this device? */
>> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
>> +			if (tmp->dev != dev)
>> +				continue;
>> +
>> +			bond = tmp;
>> +			refcount_inc(&bond->refs);
>> +			io_mm_put_locked(io_mm);
> 
> Should io_mm->pasid still be set to *pasid when the device already bond? so driver can
> always get the right pasid if it bond to a mm multi-times, without keeping the pasid itself?

(Assuming you mean set *pasid = io_mm->pasid) I think it should, I seem to
have removed it by accident from this version

Thanks,
Jean

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-02-22 13:17           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-02-22 13:17 UTC (permalink / raw)
  To: linux-arm-kernel

On 22/02/18 09:03, Yisheng Xie wrote:
[...]
>> +		/* Is it already bound to this device? */
>> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
>> +			if (tmp->dev != dev)
>> +				continue;
>> +
>> +			bond = tmp;
>> +			refcount_inc(&bond->refs);
>> +			io_mm_put_locked(io_mm);
> 
> Should io_mm->pasid still be set to *pasid when the device already bond? so driver can
> always get the right pasid if it bond to a mm multi-times, without keeping the pasid itself?

(Assuming you mean set *pasid = io_mm->pasid) I think it should, I seem to
have removed it by accident from this version

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 37/37] vfio: Add support for Shared Virtual Addressing
       [not found] ` <1519280641-30258-37-git-send-email-xieyisheng1@huawei.com>
       [not found]   ` <1519280641-30258-37-git-send-email-xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2018-03-19  9:47       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-19  9:47 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, robh-DgEjT+Ai2ygdnm+yROfE0A,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA, Catalin Marinas,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rjw-LthD3rsA81gm4RdzfppkhA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Sudeep Holla,
	christian.koenig-5C7GfCeVMHo

Hi Jean,

vfio can be compiled as module, however you use some functions which are not
exported.

comment inline:

[...]
> Add two new ioctl for VFIO containers. VFIO_IOMMU_BIND_PROCESS creates a
> bond between a container and a process address space, identified by a
> device-specific ID named PASID. This allows the device to target DMA
> transactions at the process virtual addresses without a need for mapping
> and unmapping buffers explicitly in the IOMMU. The process page tables are
> shared with the IOMMU, and mechanisms such as PCI ATS/PRI are used to
> handle faults. VFIO_IOMMU_UNBIND_PROCESS removes a bond created with
> VFIO_IOMMU_BIND_PROCESS.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
> ---
[...]
> +static struct mm_struct *vfio_iommu_get_mm_by_vpid(pid_t vpid)
> +{
> +	struct mm_struct *mm;
> +	struct task_struct *task;
> +
> +	rcu_read_lock();
> +	task = find_task_by_vpid(vpid);

Maybe can use?
	task = pid_task(find_vpid(params.vpid), PIDTYPE_PID)

> +	if (task)
> +		get_task_struct(task);
> +	rcu_read_unlock();
> +	if (!task)
> +		return ERR_PTR(-ESRCH);
> +
> +	/* Ensure that current has RW access on the mm */
> +	mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS);

You will try to export mm_access, I find Felix have tried to, but seems give up:

 https://patchwork.kernel.org/patch/9744281/

Thanks
Yisheng

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 37/37] vfio: Add support for Shared Virtual Addressing
@ 2018-03-19  9:47       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-19  9:47 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: linux-arm-kernel, linux-pci, linux-acpi, devicetree, iommu, kvm,
	joro, Mark Rutland, Catalin Marinas, Will Deacon,
	Lorenzo Pieralisi, hanjun.guo, Sudeep Holla, rjw, lenb,
	Robin Murphy, bhelgaas, alex.williamson, tn, liubo95,
	thunder.leizhen, xuzaibo, ilias.apalodimas, jonathan.cameron,
	shunyong.yang, nwatters, okaya, jcrouse, rfranz, dwmw2,
	jacob.jun.pan, yi.l.liu, ashok.raj, robdclark, christian.koenig,
	bharatku, robh

Hi Jean,

vfio can be compiled as module, however you use some functions which are not
exported.

comment inline:

[...]
> Add two new ioctl for VFIO containers. VFIO_IOMMU_BIND_PROCESS creates a
> bond between a container and a process address space, identified by a
> device-specific ID named PASID. This allows the device to target DMA
> transactions at the process virtual addresses without a need for mapping
> and unmapping buffers explicitly in the IOMMU. The process page tables are
> shared with the IOMMU, and mechanisms such as PCI ATS/PRI are used to
> handle faults. VFIO_IOMMU_UNBIND_PROCESS removes a bond created with
> VFIO_IOMMU_BIND_PROCESS.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> ---
[...]
> +static struct mm_struct *vfio_iommu_get_mm_by_vpid(pid_t vpid)
> +{
> +	struct mm_struct *mm;
> +	struct task_struct *task;
> +
> +	rcu_read_lock();
> +	task = find_task_by_vpid(vpid);

Maybe can use?
	task = pid_task(find_vpid(params.vpid), PIDTYPE_PID)

> +	if (task)
> +		get_task_struct(task);
> +	rcu_read_unlock();
> +	if (!task)
> +		return ERR_PTR(-ESRCH);
> +
> +	/* Ensure that current has RW access on the mm */
> +	mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS);

You will try to export mm_access, I find Felix have tried to, but seems give up:

 https://patchwork.kernel.org/patch/9744281/

Thanks
Yisheng


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 37/37] vfio: Add support for Shared Virtual Addressing
@ 2018-03-19  9:47       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-19  9:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jean,

vfio can be compiled as module, however you use some functions which are not
exported.

comment inline:

[...]
> Add two new ioctl for VFIO containers. VFIO_IOMMU_BIND_PROCESS creates a
> bond between a container and a process address space, identified by a
> device-specific ID named PASID. This allows the device to target DMA
> transactions at the process virtual addresses without a need for mapping
> and unmapping buffers explicitly in the IOMMU. The process page tables are
> shared with the IOMMU, and mechanisms such as PCI ATS/PRI are used to
> handle faults. VFIO_IOMMU_UNBIND_PROCESS removes a bond created with
> VFIO_IOMMU_BIND_PROCESS.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> ---
[...]
> +static struct mm_struct *vfio_iommu_get_mm_by_vpid(pid_t vpid)
> +{
> +	struct mm_struct *mm;
> +	struct task_struct *task;
> +
> +	rcu_read_lock();
> +	task = find_task_by_vpid(vpid);

Maybe can use?
	task = pid_task(find_vpid(params.vpid), PIDTYPE_PID)

> +	if (task)
> +		get_task_struct(task);
> +	rcu_read_unlock();
> +	if (!task)
> +		return ERR_PTR(-ESRCH);
> +
> +	/* Ensure that current has RW access on the mm */
> +	mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS);

You will try to export mm_access, I find Felix have tried to, but seems give up:

 https://patchwork.kernel.org/patch/9744281/

Thanks
Yisheng

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
       [not found] ` <1519280641-30258-27-git-send-email-xieyisheng1@huawei.com>
       [not found]   ` <1519280641-30258-27-git-send-email-xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2018-03-19 11:03       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-19 11:03 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, robh-DgEjT+Ai2ygdnm+yROfE0A,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA, Catalin Marinas,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rjw-LthD3rsA81gm4RdzfppkhA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Sudeep Holla,
	christian.koenig-5C7GfCeVMHo

Hi Jean,

[...]
> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	if (ret)
>  		return ret;
>  
> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
> +		if (ret)
> +			return ret;

I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
the device probe will failed here, is this what we want?

Thanks
Yisheng

> +	}
> +
>  	/* And we're up. Go go go! */
>  	ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
>  				     "smmu3.%pa", &ioaddr);

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
@ 2018-03-19 11:03       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-19 11:03 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Mark Rutland, ilias.apalodimas, kvm, linux-pci, xuzaibo,
	jonathan.cameron, Will Deacon, okaya, robh, Lorenzo Pieralisi,
	ashok.raj, tn, joro, robdclark, bharatku, linux-acpi,
	Catalin Marinas, rfranz, lenb, devicetree, jacob.jun.pan,
	alex.williamson, yi.l.liu, thunder.leizhen, bhelgaas,
	linux-arm-kernel, shunyong.yang, dwmw2, liubo95, rjw, jcrouse,
	iommu, hanjun.guo, Sudeep Holla, Robin Murphy, christian.koenig,
	nwatters

Hi Jean,

[...]
> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	if (ret)
>  		return ret;
>  
> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
> +		if (ret)
> +			return ret;

I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
the device probe will failed here, is this what we want?

Thanks
Yisheng

> +	}
> +
>  	/* And we're up. Go go go! */
>  	ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
>  				     "smmu3.%pa", &ioaddr);


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
@ 2018-03-19 11:03       ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-19 11:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jean,

[...]
> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	if (ret)
>  		return ret;
>  
> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
> +		if (ret)
> +			return ret;

I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
the device probe will failed here, is this what we want?

Thanks
Yisheng

> +	}
> +
>  	/* And we're up. Go go go! */
>  	ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
>  				     "smmu3.%pa", &ioaddr);

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
  2018-03-19 11:03       ` Yisheng Xie
  (?)
@ 2018-03-21 13:24           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-21 13:24 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, robh-DgEjT+Ai2ygdnm+yROfE0A,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA, Catalin Marinas,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rjw-LthD3rsA81gHJAlzAAQw8A

Hi Yisheng,

On 19/03/18 11:03, Yisheng Xie wrote:
> Hi Jean,
> 
> [...]
>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>  	if (ret)
>>  		return ret;
>>  
>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>> +		if (ret)
>> +			return ret;
> 
> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
> the device probe will failed here, is this what we want?

Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
can happen. I'm not sure what the best practices are, but I feel like it
would be too much work to guard against config combinations that violate
an explicit "select" in Kconfig

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
@ 2018-03-21 13:24           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-21 13:24 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Mark Rutland, ilias.apalodimas, kvm, linux-pci, xuzaibo,
	jonathan.cameron, Will Deacon, okaya, robh, Lorenzo Pieralisi,
	ashok.raj, tn, joro, robdclark, bharatku, linux-acpi,
	Catalin Marinas, rfranz, lenb, devicetree, jacob.jun.pan,
	alex.williamson, yi.l.liu, thunder.leizhen, bhelgaas,
	linux-arm-kernel, shunyong.yang, dwmw2, liubo95, rjw, jcrouse,
	iommu, hanjun.guo, Sudeep Holla, Robin Murphy, christian.koenig,
	nwatters

Hi Yisheng,

On 19/03/18 11:03, Yisheng Xie wrote:
> Hi Jean,
> 
> [...]
>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>  	if (ret)
>>  		return ret;
>>  
>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>> +		if (ret)
>> +			return ret;
> 
> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
> the device probe will failed here, is this what we want?

Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
can happen. I'm not sure what the best practices are, but I feel like it
would be too much work to guard against config combinations that violate
an explicit "select" in Kconfig

Thanks,
Jean

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
@ 2018-03-21 13:24           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-21 13:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Yisheng,

On 19/03/18 11:03, Yisheng Xie wrote:
> Hi Jean,
> 
> [...]
>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>  	if (ret)
>>  		return ret;
>>  
>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>> +		if (ret)
>> +			return ret;
> 
> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
> the device probe will failed here, is this what we want?

Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
can happen. I'm not sure what the best practices are, but I feel like it
would be too much work to guard against config combinations that violate
an explicit "select" in Kconfig

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 37/37] vfio: Add support for Shared Virtual Addressing
  2018-03-19  9:47       ` Yisheng Xie
  (?)
@ 2018-03-21 13:40           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-21 13:40 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, robh-DgEjT+Ai2ygdnm+yROfE0A,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA, Catalin Marinas,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rjw-LthD3rsA81gHJAlzAAQw8A

On 19/03/18 09:47, Yisheng Xie wrote:
> Hi Jean,
> 
> vfio can be compiled as module, however you use some functions which are not
> exported.

Oh right. I remember the kbuild test robot warning about this once, I
wonder why it didn't find this one.

> comment inline:
> 
> [...]
>> Add two new ioctl for VFIO containers. VFIO_IOMMU_BIND_PROCESS creates a
>> bond between a container and a process address space, identified by a
>> device-specific ID named PASID. This allows the device to target DMA
>> transactions at the process virtual addresses without a need for mapping
>> and unmapping buffers explicitly in the IOMMU. The process page tables are
>> shared with the IOMMU, and mechanisms such as PCI ATS/PRI are used to
>> handle faults. VFIO_IOMMU_UNBIND_PROCESS removes a bond created with
>> VFIO_IOMMU_BIND_PROCESS.
>>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
>> ---
> [...]
>> +static struct mm_struct *vfio_iommu_get_mm_by_vpid(pid_t vpid)
>> +{
>> +	struct mm_struct *mm;
>> +	struct task_struct *task;
>> +
>> +	rcu_read_lock();
>> +	task = find_task_by_vpid(vpid);
> 
> Maybe can use?
> 	task = pid_task(find_vpid(params.vpid), PIDTYPE_PID)

I'd rather submit a patch requesting to export the symbol. Especially
since this function can be further simplified by using the brand new
find_get_task_by_vpid() helper, introduced by 2ee0826085d1.

>> +	if (task)
>> +		get_task_struct(task);
>> +	rcu_read_unlock();
>> +	if (!task)
>> +		return ERR_PTR(-ESRCH);
>> +
>> +	/* Ensure that current has RW access on the mm */
>> +	mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS);
> 
> You will try to export mm_access, I find Felix have tried to, but seems give up:
> 
>  https://patchwork.kernel.org/patch/9744281/

Thanks for the pointer, I'll try to revive this.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 37/37] vfio: Add support for Shared Virtual Addressing
@ 2018-03-21 13:40           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-21 13:40 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Mark Rutland, ilias.apalodimas, kvm, linux-pci, xuzaibo,
	jonathan.cameron, Will Deacon, okaya, robh, Lorenzo Pieralisi,
	ashok.raj, tn, joro, robdclark, bharatku, linux-acpi,
	Catalin Marinas, rfranz, lenb, devicetree, jacob.jun.pan,
	alex.williamson, yi.l.liu, thunder.leizhen, bhelgaas,
	linux-arm-kernel, shunyong.yang, dwmw2, liubo95, rjw, jcrouse,
	iommu, hanjun.guo, Sudeep Holla, Robin Murphy, christian.koenig,
	nwatters

On 19/03/18 09:47, Yisheng Xie wrote:
> Hi Jean,
> 
> vfio can be compiled as module, however you use some functions which are not
> exported.

Oh right. I remember the kbuild test robot warning about this once, I
wonder why it didn't find this one.

> comment inline:
> 
> [...]
>> Add two new ioctl for VFIO containers. VFIO_IOMMU_BIND_PROCESS creates a
>> bond between a container and a process address space, identified by a
>> device-specific ID named PASID. This allows the device to target DMA
>> transactions at the process virtual addresses without a need for mapping
>> and unmapping buffers explicitly in the IOMMU. The process page tables are
>> shared with the IOMMU, and mechanisms such as PCI ATS/PRI are used to
>> handle faults. VFIO_IOMMU_UNBIND_PROCESS removes a bond created with
>> VFIO_IOMMU_BIND_PROCESS.
>>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> ---
> [...]
>> +static struct mm_struct *vfio_iommu_get_mm_by_vpid(pid_t vpid)
>> +{
>> +	struct mm_struct *mm;
>> +	struct task_struct *task;
>> +
>> +	rcu_read_lock();
>> +	task = find_task_by_vpid(vpid);
> 
> Maybe can use?
> 	task = pid_task(find_vpid(params.vpid), PIDTYPE_PID)

I'd rather submit a patch requesting to export the symbol. Especially
since this function can be further simplified by using the brand new
find_get_task_by_vpid() helper, introduced by 2ee0826085d1.

>> +	if (task)
>> +		get_task_struct(task);
>> +	rcu_read_unlock();
>> +	if (!task)
>> +		return ERR_PTR(-ESRCH);
>> +
>> +	/* Ensure that current has RW access on the mm */
>> +	mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS);
> 
> You will try to export mm_access, I find Felix have tried to, but seems give up:
> 
>  https://patchwork.kernel.org/patch/9744281/

Thanks for the pointer, I'll try to revive this.

Thanks,
Jean


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 37/37] vfio: Add support for Shared Virtual Addressing
@ 2018-03-21 13:40           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-21 13:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 19/03/18 09:47, Yisheng Xie wrote:
> Hi Jean,
> 
> vfio can be compiled as module, however you use some functions which are not
> exported.

Oh right. I remember the kbuild test robot warning about this once, I
wonder why it didn't find this one.

> comment inline:
> 
> [...]
>> Add two new ioctl for VFIO containers. VFIO_IOMMU_BIND_PROCESS creates a
>> bond between a container and a process address space, identified by a
>> device-specific ID named PASID. This allows the device to target DMA
>> transactions at the process virtual addresses without a need for mapping
>> and unmapping buffers explicitly in the IOMMU. The process page tables are
>> shared with the IOMMU, and mechanisms such as PCI ATS/PRI are used to
>> handle faults. VFIO_IOMMU_UNBIND_PROCESS removes a bond created with
>> VFIO_IOMMU_BIND_PROCESS.
>>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> ---
> [...]
>> +static struct mm_struct *vfio_iommu_get_mm_by_vpid(pid_t vpid)
>> +{
>> +	struct mm_struct *mm;
>> +	struct task_struct *task;
>> +
>> +	rcu_read_lock();
>> +	task = find_task_by_vpid(vpid);
> 
> Maybe can use?
> 	task = pid_task(find_vpid(params.vpid), PIDTYPE_PID)

I'd rather submit a patch requesting to export the symbol. Especially
since this function can be further simplified by using the brand new
find_get_task_by_vpid() helper, introduced by 2ee0826085d1.

>> +	if (task)
>> +		get_task_struct(task);
>> +	rcu_read_unlock();
>> +	if (!task)
>> +		return ERR_PTR(-ESRCH);
>> +
>> +	/* Ensure that current has RW access on the mm */
>> +	mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS);
> 
> You will try to export mm_access, I find Felix have tried to, but seems give up:
> 
>  https://patchwork.kernel.org/patch/9744281/

Thanks for the pointer, I'll try to revive this.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
  2018-03-21 13:24           ` Jean-Philippe Brucker
  (?)
@ 2018-03-22  1:09               ` Yisheng Xie
  -1 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-22  1:09 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, robh-DgEjT+Ai2ygdnm+yROfE0A,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA, Catalin Marinas,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rjw-LthD3rsA81gHJAlzAAQw8A

Hi Jean,

On 2018/3/21 21:24, Jean-Philippe Brucker wrote:
> Hi Yisheng,
> 
> On 19/03/18 11:03, Yisheng Xie wrote:
>> Hi Jean,
>>
>> [...]
>>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>>  	if (ret)
>>>  		return ret;
>>>  
>>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>>> +		if (ret)
>>> +			return ret;
>>
>> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
>> the device probe will failed here, is this what we want?
> 
> Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
> can happen. 

But CONFIG_IOMMU_FAULT can be changed after select, maybe can make it as unchangeable.

Thanks
Yisheng

> I'm not sure what the best practices are, but I feel like it
> would be too much work to guard against config combinations that violate
> an explicit "select" in Kconfig
> 
> Thanks,
> Jean
> 
> .
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
@ 2018-03-22  1:09               ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-22  1:09 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Mark Rutland, ilias.apalodimas, kvm, linux-pci, xuzaibo,
	jonathan.cameron, Will Deacon, okaya, robh, Lorenzo Pieralisi,
	ashok.raj, tn, joro, robdclark, bharatku, linux-acpi,
	Catalin Marinas, rfranz, lenb, devicetree, jacob.jun.pan,
	alex.williamson, yi.l.liu, thunder.leizhen, bhelgaas,
	linux-arm-kernel, shunyong.yang, dwmw2, liubo95, rjw, jcrouse,
	iommu, hanjun.guo, Sudeep Holla, Robin Murphy, christian.koenig,
	nwatters

Hi Jean,

On 2018/3/21 21:24, Jean-Philippe Brucker wrote:
> Hi Yisheng,
> 
> On 19/03/18 11:03, Yisheng Xie wrote:
>> Hi Jean,
>>
>> [...]
>>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>>  	if (ret)
>>>  		return ret;
>>>  
>>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>>> +		if (ret)
>>> +			return ret;
>>
>> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
>> the device probe will failed here, is this what we want?
> 
> Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
> can happen. 

But CONFIG_IOMMU_FAULT can be changed after select, maybe can make it as unchangeable.

Thanks
Yisheng

> I'm not sure what the best practices are, but I feel like it
> would be too much work to guard against config combinations that violate
> an explicit "select" in Kconfig
> 
> Thanks,
> Jean
> 
> .
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
@ 2018-03-22  1:09               ` Yisheng Xie
  0 siblings, 0 replies; 62+ messages in thread
From: Yisheng Xie @ 2018-03-22  1:09 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jean,

On 2018/3/21 21:24, Jean-Philippe Brucker wrote:
> Hi Yisheng,
> 
> On 19/03/18 11:03, Yisheng Xie wrote:
>> Hi Jean,
>>
>> [...]
>>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>>  	if (ret)
>>>  		return ret;
>>>  
>>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>>> +		if (ret)
>>> +			return ret;
>>
>> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
>> the device probe will failed here, is this what we want?
> 
> Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
> can happen. 

But CONFIG_IOMMU_FAULT can be changed after select, maybe can make it as unchangeable.

Thanks
Yisheng

> I'm not sure what the best practices are, but I feel like it
> would be too much work to guard against config combinations that violate
> an explicit "select" in Kconfig
> 
> Thanks,
> Jean
> 
> .
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
  2018-03-22  1:09               ` Yisheng Xie
  (?)
@ 2018-04-04 10:13                   ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-04 10:13 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	kvm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, robh-DgEjT+Ai2ygdnm+yROfE0A,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA, Catalin Marinas,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rjw-LthD3rsA81gHJAlzAAQw8A

On 22/03/18 01:09, Yisheng Xie wrote:
> Hi Jean,
> 
> On 2018/3/21 21:24, Jean-Philippe Brucker wrote:
>> Hi Yisheng,
>>
>> On 19/03/18 11:03, Yisheng Xie wrote:
>>> Hi Jean,
>>>
>>> [...]
>>>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>>>  	if (ret)
>>>>  		return ret;
>>>>  
>>>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>>>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>>>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>>>> +		if (ret)
>>>> +			return ret;
>>>
>>> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
>>> the device probe will failed here, is this what we want?
>>
>> Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
>> can happen. 
> 
> But CONFIG_IOMMU_FAULT can be changed after select, maybe can make it as unchangeable. Seems sensible, I don't see a reason to leave IOMMU_FAULT selectable
manually.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
@ 2018-04-04 10:13                   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-04 10:13 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Mark Rutland, ilias.apalodimas, kvm, linux-pci, xuzaibo,
	jonathan.cameron, Will Deacon, okaya, robh, Lorenzo Pieralisi,
	ashok.raj, tn, joro, robdclark, bharatku, linux-acpi,
	Catalin Marinas, rfranz, lenb, devicetree, jacob.jun.pan,
	alex.williamson, yi.l.liu, thunder.leizhen, bhelgaas,
	linux-arm-kernel, shunyong.yang, dwmw2, liubo95, rjw, jcrouse,
	iommu, hanjun.guo, Sudeep Holla, Robin Murphy, christian.koenig,
	nwatters

On 22/03/18 01:09, Yisheng Xie wrote:
> Hi Jean,
> 
> On 2018/3/21 21:24, Jean-Philippe Brucker wrote:
>> Hi Yisheng,
>>
>> On 19/03/18 11:03, Yisheng Xie wrote:
>>> Hi Jean,
>>>
>>> [...]
>>>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>>>  	if (ret)
>>>>  		return ret;
>>>>  
>>>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>>>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>>>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>>>> +		if (ret)
>>>> +			return ret;
>>>
>>> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
>>> the device probe will failed here, is this what we want?
>>
>> Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
>> can happen. 
> 
> But CONFIG_IOMMU_FAULT can be changed after select, maybe can make it as unchangeable. Seems sensible, I don't see a reason to leave IOMMU_FAULT selectable
manually.

Thanks,
Jean

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue
@ 2018-04-04 10:13                   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-04 10:13 UTC (permalink / raw)
  To: linux-arm-kernel

On 22/03/18 01:09, Yisheng Xie wrote:
> Hi Jean,
> 
> On 2018/3/21 21:24, Jean-Philippe Brucker wrote:
>> Hi Yisheng,
>>
>> On 19/03/18 11:03, Yisheng Xie wrote:
>>> Hi Jean,
>>>
>>> [...]
>>>> @@ -3168,6 +3260,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>>>  	if (ret)
>>>>  		return ret;
>>>>  
>>>> +	if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) {
>>>> +		smmu->faultq_nb.notifier_call = arm_smmu_flush_queues;
>>>> +		ret = iommu_fault_queue_register(&smmu->faultq_nb);
>>>> +		if (ret)
>>>> +			return ret;
>>>
>>> I find a case here: CONFIG_IOMMU_FAULT=n, and smmu support feature STALLS or PRI,
>>> the device probe will failed here, is this what we want?
>>
>> Since CONFIG_ARM_SMMU_V3 selects CONFIG_IOMMU_FAULT, I don't think it
>> can happen. 
> 
> But CONFIG_IOMMU_FAULT can be changed after select, maybe can make it as unchangeable. Seems sensible, I don't see a reason to leave IOMMU_FAULT selectable
manually.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-04-24 17:17                 ` Sinan Kaya
@ 2018-04-24 18:52                     ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker via iommu @ 2018-04-24 18:52 UTC (permalink / raw)
  To: Sinan Kaya, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	Catalin Marinas, xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	lenb-DgEjT+Ai2ygdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rjw-LthD3rsA81gm4RdzfppkhA, Sudeep Holla,
	christian.koenig-5C7GfCeVMHo

On 24/04/18 18:17, Sinan Kaya wrote:
> On 4/24/2018 5:33 AM, Jean-Philippe Brucker wrote:
>>> Please return pasid when you find an io_mm that is already bound. Something like
>>> *pasid = io_mm->pasid should do the work here when bond is true.
>> Right. I think we should also keep returning 0, not switch to -EEXIST or
>> similar. So in next version a driver can call bind(devX, mmY) multiple
>> times, but the first unbind() removes the bond.
> 
> If we are going to allow multiple binds, then the last unbind should
> remove the bond rather than the first one via reference counting.

Yeah that's probably better. Since a bond belongs to a device driver it
doesn't need multiple bind/unbind, so earlier in this thread (1/37) I
talked about removing the bond->refs. But thinking about it, there still
is a need for it. When mm exits, we now need to call the device driver's
mm_exit handler outside of the spinlock, so we have to take a ref in
order to prevent a concurrent unbind() from freeing the bond.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-24 18:52                     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-24 18:52 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/04/18 18:17, Sinan Kaya wrote:
> On 4/24/2018 5:33 AM, Jean-Philippe Brucker wrote:
>>> Please return pasid when you find an io_mm that is already bound. Something like
>>> *pasid = io_mm->pasid should do the work here when bond is true.
>> Right. I think we should also keep returning 0, not switch to -EEXIST or
>> similar. So in next version a driver can call bind(devX, mmY) multiple
>> times, but the first unbind() removes the bond.
> 
> If we are going to allow multiple binds, then the last unbind should
> remove the bond rather than the first one via reference counting.

Yeah that's probably better. Since a bond belongs to a device driver it
doesn't need multiple bind/unbind, so earlier in this thread (1/37) I
talked about removing the bond->refs. But thinking about it, there still
is a need for it. When mm exits, we now need to call the device driver's
mm_exit handler outside of the spinlock, so we have to take a ref in
order to prevent a concurrent unbind() from freeing the bond.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-04-24  9:33             ` Jean-Philippe Brucker
  (?)
@ 2018-04-24 17:17                 ` Sinan Kaya
  -1 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-24 17:17 UTC (permalink / raw)
  To: Jean-Philippe Brucker,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	Catalin Marinas, xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	lenb-DgEjT+Ai2ygdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rjw-LthD3rsA81gm4RdzfppkhA, Sudeep Holla,
	christian.koenig-5C7GfCeVMHo

On 4/24/2018 5:33 AM, Jean-Philippe Brucker wrote:
>> Please return pasid when you find an io_mm that is already bound. Something like
>> *pasid = io_mm->pasid should do the work here when bond is true.
> Right. I think we should also keep returning 0, not switch to -EEXIST or
> similar. So in next version a driver can call bind(devX, mmY) multiple
> times, but the first unbind() removes the bond.

If we are going to allow multiple binds, then the last unbind should
remove the bond rather than the first one via reference counting.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-24 17:17                 ` Sinan Kaya
  0 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-24 17:17 UTC (permalink / raw)
  To: Jean-Philippe Brucker, linux-arm-kernel, linux-pci, linux-acpi,
	devicetree, iommu, kvm
  Cc: joro, robh+dt, Mark Rutland, Catalin Marinas, Will Deacon,
	Lorenzo Pieralisi, hanjun.guo, Sudeep Holla, rjw, lenb,
	Robin Murphy, bhelgaas, alex.williamson, tn, liubo95,
	thunder.leizhen, xieyisheng1, xuzaibo, ilias.apalodimas,
	jonathan.cameron, shunyong.yang, nwatters, jcrouse, rfranz,
	dwmw2, jacob.jun.pan, yi.l.liu, ashok.raj, robdclark,
	christian.koenig, bharatku

On 4/24/2018 5:33 AM, Jean-Philippe Brucker wrote:
>> Please return pasid when you find an io_mm that is already bound. Something like
>> *pasid = io_mm->pasid should do the work here when bond is true.
> Right. I think we should also keep returning 0, not switch to -EEXIST or
> similar. So in next version a driver can call bind(devX, mmY) multiple
> times, but the first unbind() removes the bond.

If we are going to allow multiple binds, then the last unbind should
remove the bond rather than the first one via reference counting.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-24 17:17                 ` Sinan Kaya
  0 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-24 17:17 UTC (permalink / raw)
  To: linux-arm-kernel

On 4/24/2018 5:33 AM, Jean-Philippe Brucker wrote:
>> Please return pasid when you find an io_mm that is already bound. Something like
>> *pasid = io_mm->pasid should do the work here when bond is true.
> Right. I think we should also keep returning 0, not switch to -EEXIST or
> similar. So in next version a driver can call bind(devX, mmY) multiple
> times, but the first unbind() removes the bond.

If we are going to allow multiple binds, then the last unbind should
remove the bond rather than the first one via reference counting.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-04-24  1:32         ` Sinan Kaya
  (?)
@ 2018-04-24  9:33             ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-24  9:33 UTC (permalink / raw)
  To: Sinan Kaya, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	Catalin Marinas, xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	lenb-DgEjT+Ai2ygdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rjw-LthD3rsA81gm4RdzfppkhA, Sudeep Holla,
	christian.koenig-5C7GfCeVMHo

On 24/04/18 02:32, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> /**
>>   * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>>   * @dev: the device
>> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>>  			  unsigned long flags, void *drvdata)
>>  {
>> +	int i, ret;
>> +	struct io_mm *io_mm = NULL;
>>  	struct iommu_domain *domain;
>> +	struct iommu_bond *bond = NULL, *tmp;
>>  	struct iommu_param *dev_param = dev->iommu_param;
>>  
>>  	domain = iommu_get_domain_for_dev(dev);
>> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>>  		return -EINVAL;
>>  
>> -	return -ENOSYS; /* TODO */
>> +	/* If an io_mm already exists, use it */
>> +	spin_lock(&iommu_sva_lock);
>> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
>> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
>> +			continue;
>> +
>> +		/* Is it already bound to this device? */
>> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
>> +			if (tmp->dev != dev)
>> +				continue;
>> +
>> +			bond = tmp;
>> +			refcount_inc(&bond->refs);
>> +			io_mm_put_locked(io_mm);
>> +			break;
>> +		}
>> +		break;
>> +	}
>> +	spin_unlock(&iommu_sva_lock);
>> +
>> +	if (bond)
> 
> Please return pasid when you find an io_mm that is already bound. Something like
> *pasid = io_mm->pasid should do the work here when bond is true.

Right. I think we should also keep returning 0, not switch to -EEXIST or
similar. So in next version a driver can call bind(devX, mmY) multiple
times, but the first unbind() removes the bond.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-24  9:33             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-24  9:33 UTC (permalink / raw)
  To: Sinan Kaya, linux-arm-kernel, linux-pci, linux-acpi, devicetree,
	iommu, kvm
  Cc: Mark Rutland, xieyisheng1, ilias.apalodimas, Catalin Marinas,
	xuzaibo, jonathan.cameron, Will Deacon, yi.l.liu,
	Lorenzo Pieralisi, ashok.raj, tn, joro, bharatku, rfranz, lenb,
	jacob.jun.pan, alex.williamson, robh+dt, thunder.leizhen,
	bhelgaas, shunyong.yang, dwmw2, liubo95, rjw, jcrouse, robdclark,
	hanjun.guo, Sudeep Holla, Robin Murphy, christian.koenig,
	nwatters

On 24/04/18 02:32, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> /**
>>   * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>>   * @dev: the device
>> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>>  			  unsigned long flags, void *drvdata)
>>  {
>> +	int i, ret;
>> +	struct io_mm *io_mm = NULL;
>>  	struct iommu_domain *domain;
>> +	struct iommu_bond *bond = NULL, *tmp;
>>  	struct iommu_param *dev_param = dev->iommu_param;
>>  
>>  	domain = iommu_get_domain_for_dev(dev);
>> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>>  		return -EINVAL;
>>  
>> -	return -ENOSYS; /* TODO */
>> +	/* If an io_mm already exists, use it */
>> +	spin_lock(&iommu_sva_lock);
>> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
>> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
>> +			continue;
>> +
>> +		/* Is it already bound to this device? */
>> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
>> +			if (tmp->dev != dev)
>> +				continue;
>> +
>> +			bond = tmp;
>> +			refcount_inc(&bond->refs);
>> +			io_mm_put_locked(io_mm);
>> +			break;
>> +		}
>> +		break;
>> +	}
>> +	spin_unlock(&iommu_sva_lock);
>> +
>> +	if (bond)
> 
> Please return pasid when you find an io_mm that is already bound. Something like
> *pasid = io_mm->pasid should do the work here when bond is true.

Right. I think we should also keep returning 0, not switch to -EEXIST or
similar. So in next version a driver can call bind(devX, mmY) multiple
times, but the first unbind() removes the bond.

Thanks,
Jean

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-24  9:33             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-24  9:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/04/18 02:32, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> /**
>>   * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>>   * @dev: the device
>> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>>  			  unsigned long flags, void *drvdata)
>>  {
>> +	int i, ret;
>> +	struct io_mm *io_mm = NULL;
>>  	struct iommu_domain *domain;
>> +	struct iommu_bond *bond = NULL, *tmp;
>>  	struct iommu_param *dev_param = dev->iommu_param;
>>  
>>  	domain = iommu_get_domain_for_dev(dev);
>> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>>  		return -EINVAL;
>>  
>> -	return -ENOSYS; /* TODO */
>> +	/* If an io_mm already exists, use it */
>> +	spin_lock(&iommu_sva_lock);
>> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
>> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
>> +			continue;
>> +
>> +		/* Is it already bound to this device? */
>> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
>> +			if (tmp->dev != dev)
>> +				continue;
>> +
>> +			bond = tmp;
>> +			refcount_inc(&bond->refs);
>> +			io_mm_put_locked(io_mm);
>> +			break;
>> +		}
>> +		break;
>> +	}
>> +	spin_unlock(&iommu_sva_lock);
>> +
>> +	if (bond)
> 
> Please return pasid when you find an io_mm that is already bound. Something like
> *pasid = io_mm->pasid should do the work here when bond is true.

Right. I think we should also keep returning 0, not switch to -EEXIST or
similar. So in next version a driver can call bind(devX, mmY) multiple
times, but the first unbind() removes the bond.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-02-12 18:33     ` Jean-Philippe Brucker
  (?)
@ 2018-04-24  1:32         ` Sinan Kaya
  -1 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-24  1:32 UTC (permalink / raw)
  To: Jean-Philippe Brucker,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: mark.rutland-5wv7dgnIgG8,
	ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	catalin.marinas-5wv7dgnIgG8, xuzaibo-hv44wF8Li93QT0dZR+AlfA,
	will.deacon-5wv7dgnIgG8, ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	lenb-DgEjT+Ai2ygdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rjw-LthD3rsA81gm4RdzfppkhA, sudeep.holla-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> /**
>   * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>   * @dev: the device
> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>  
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>  
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);
> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)

Please return pasid when you find an io_mm that is already bound. Something like
*pasid = io_mm->pasid should do the work here when bond is true.

> +		return 0;


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-24  1:32         ` Sinan Kaya
  0 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-24  1:32 UTC (permalink / raw)
  To: Jean-Philippe Brucker, linux-arm-kernel, linux-pci, linux-acpi,
	devicetree, iommu, kvm
  Cc: mark.rutland, xieyisheng1, ilias.apalodimas, catalin.marinas,
	xuzaibo, jonathan.cameron, will.deacon, yi.l.liu,
	lorenzo.pieralisi, ashok.raj, tn, joro, bharatku, rfranz, lenb,
	jacob.jun.pan, alex.williamson, robh+dt, thunder.leizhen,
	bhelgaas, shunyong.yang, dwmw2, liubo95, rjw, jcrouse, robdclark,
	hanjun.guo, sudeep.holla, robin.murphy, christian.koenig,
	nwatters

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> /**
>   * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>   * @dev: the device
> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>  
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>  
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);
> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)

Please return pasid when you find an io_mm that is already bound. Something like
*pasid = io_mm->pasid should do the work here when bond is true.

> +		return 0;


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-24  1:32         ` Sinan Kaya
  0 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-24  1:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> /**
>   * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>   * @dev: the device
> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>  
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>  
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);
> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)

Please return pasid when you find an io_mm that is already bound. Something like
*pasid = io_mm->pasid should do the work here when bond is true.

> +		return 0;


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-04-10 18:53       ` Sinan Kaya
  (?)
@ 2018-04-13 10:59           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-13 10:59 UTC (permalink / raw)
  To: Sinan Kaya, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	Catalin Marinas, xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	lenb-DgEjT+Ai2ygdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rjw-LthD3rsA81gm4RdzfppkhA, Sudeep Holla,
	christian.koenig-5C7GfCeVMHo

On 10/04/18 19:53, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> +static void io_mm_detach_all_locked(struct iommu_bond *bond)
>> +{
>> +	while (!io_mm_detach_locked(bond));
>> +}
>> +
> 
> I don't remember if I mentioned this before or not but I think this loop
> needs a little bit relaxation with yield and maybe an informational message
> with might help if wait exceeds some time.

Right, at the very least we should have a cpu_relax here. I think this
bit is going away, though, because I want to lift the possibility of
calling bind() for the same dev/mm pair multiple times. It's not useful
in my opinion because that call could only be issued by a given driver.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-13 10:59           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-13 10:59 UTC (permalink / raw)
  To: Sinan Kaya, linux-arm-kernel, linux-pci, linux-acpi, devicetree,
	iommu, kvm
  Cc: joro, robh+dt, Mark Rutland, Catalin Marinas, Will Deacon,
	Lorenzo Pieralisi, hanjun.guo, Sudeep Holla, rjw, lenb,
	Robin Murphy, bhelgaas, alex.williamson, tn, liubo95,
	thunder.leizhen, xieyisheng1, xuzaibo, ilias.apalodimas,
	jonathan.cameron, shunyong.yang, nwatters, jcrouse, rfranz,
	dwmw2, jacob.jun.pan, yi.l.liu, ashok.raj, robdclark,
	christian.koenig, bharatku

On 10/04/18 19:53, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> +static void io_mm_detach_all_locked(struct iommu_bond *bond)
>> +{
>> +	while (!io_mm_detach_locked(bond));
>> +}
>> +
> 
> I don't remember if I mentioned this before or not but I think this loop
> needs a little bit relaxation with yield and maybe an informational message
> with might help if wait exceeds some time.

Right, at the very least we should have a cpu_relax here. I think this
bit is going away, though, because I want to lift the possibility of
calling bind() for the same dev/mm pair multiple times. It's not useful
in my opinion because that call could only be issued by a given driver.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-13 10:59           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-04-13 10:59 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/04/18 19:53, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> +static void io_mm_detach_all_locked(struct iommu_bond *bond)
>> +{
>> +	while (!io_mm_detach_locked(bond));
>> +}
>> +
> 
> I don't remember if I mentioned this before or not but I think this loop
> needs a little bit relaxation with yield and maybe an informational message
> with might help if wait exceeds some time.

Right, at the very least we should have a cpu_relax here. I think this
bit is going away, though, because I want to lift the possibility of
calling bind() for the same dev/mm pair multiple times. It's not useful
in my opinion because that call could only be issued by a given driver.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-02-12 18:33     ` Jean-Philippe Brucker
  (?)
@ 2018-04-10 18:53       ` Sinan Kaya
  -1 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-10 18:53 UTC (permalink / raw)
  To: Jean-Philippe Brucker, linux-arm-kernel, linux-pci, linux-acpi,
	devicetree, iommu, kvm
  Cc: mark.rutland, xieyisheng1, ilias.apalodimas, catalin.marinas,
	xuzaibo, jonathan.cameron, will.deacon, yi.l.liu,
	lorenzo.pieralisi, ashok.raj, tn, joro, bharatku, rfranz, lenb,
	jacob.jun.pan, alex.williamson, robh+dt, thunder.leizhen,
	bhelgaas, shunyong.yang, dwmw2, liubo95, rjw, jcrouse, robdclark,
	hanjun.guo, sudeep.holla, robin.murphy, christian.koenig,
	nwatters

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> +static void io_mm_detach_all_locked(struct iommu_bond *bond)
> +{
> +	while (!io_mm_detach_locked(bond));
> +}
> +

I don't remember if I mentioned this before or not but I think this loop
needs a little bit relaxation with yield and maybe an informational message
with might help if wait exceeds some time.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-10 18:53       ` Sinan Kaya
  0 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-10 18:53 UTC (permalink / raw)
  To: Jean-Philippe Brucker, linux-arm-kernel, linux-pci, linux-acpi,
	devicetree, iommu, kvm
  Cc: mark.rutland, xieyisheng1, ilias.apalodimas, catalin.marinas,
	xuzaibo, jonathan.cameron, will.deacon, yi.l.liu,
	lorenzo.pieralisi, ashok.raj, tn, joro, bharatku, rfranz, lenb,
	jacob.jun.pan, alex.williamson, robh+dt, thunder.leizhen,
	bhelgaas, shunyong.yang, dwmw2, liubo95, rjw, jcrouse, robdclark,
	hanjun.guo, sudeep.holla, robin.murphy, christian.koenig,
	nwatters

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> +static void io_mm_detach_all_locked(struct iommu_bond *bond)
> +{
> +	while (!io_mm_detach_locked(bond));
> +}
> +

I don't remember if I mentioned this before or not but I think this loop
needs a little bit relaxation with yield and maybe an informational message
with might help if wait exceeds some time.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-04-10 18:53       ` Sinan Kaya
  0 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-04-10 18:53 UTC (permalink / raw)
  To: linux-arm-kernel

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> +static void io_mm_detach_all_locked(struct iommu_bond *bond)
> +{
> +	while (!io_mm_detach_locked(bond));
> +}
> +

I don't remember if I mentioned this before or not but I think this loop
needs a little bit relaxation with yield and maybe an informational message
with might help if wait exceeds some time.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-03-05 15:28         ` Sinan Kaya
  (?)
@ 2018-03-06 10:37             ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-06 10:37 UTC (permalink / raw)
  To: Sinan Kaya, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: Mark Rutland, ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	Catalin Marinas, xuzaibo-hv44wF8Li93QT0dZR+AlfA, Will Deacon,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	lenb-DgEjT+Ai2ygdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rjw-LthD3rsA81gm4RdzfppkhA, Sudeep Holla,
	christian.koenig-5C7GfCeVMHo

On 05/03/18 15:28, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> +static void io_mm_free(struct io_mm *io_mm)
>> +{
>> +	struct mm_struct *mm;
>> +	void (*release)(struct io_mm *);
>> +
>> +	release = io_mm->release;
>> +	mm = io_mm->mm;
>> +
>> +	release(io_mm);
> 
> Is there any reason why you can't call iommu->release()
> here directly? Why do you need the release local variable?

I think I can remove the local variable

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-06 10:37             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-06 10:37 UTC (permalink / raw)
  To: Sinan Kaya, linux-arm-kernel, linux-pci, linux-acpi, devicetree,
	iommu, kvm
  Cc: joro, robh+dt, Mark Rutland, Catalin Marinas, Will Deacon,
	Lorenzo Pieralisi, hanjun.guo, Sudeep Holla, rjw, lenb,
	Robin Murphy, bhelgaas, alex.williamson, tn, liubo95,
	thunder.leizhen, xieyisheng1, xuzaibo, ilias.apalodimas,
	jonathan.cameron, shunyong.yang, nwatters, jcrouse, rfranz,
	dwmw2, jacob.jun.pan, yi.l.liu, ashok.raj, robdclark,
	christian.koenig, bharatku

On 05/03/18 15:28, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> +static void io_mm_free(struct io_mm *io_mm)
>> +{
>> +	struct mm_struct *mm;
>> +	void (*release)(struct io_mm *);
>> +
>> +	release = io_mm->release;
>> +	mm = io_mm->mm;
>> +
>> +	release(io_mm);
> 
> Is there any reason why you can't call iommu->release()
> here directly? Why do you need the release local variable?

I think I can remove the local variable

Thanks,
Jean


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-06 10:37             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-06 10:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 05/03/18 15:28, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> +static void io_mm_free(struct io_mm *io_mm)
>> +{
>> +	struct mm_struct *mm;
>> +	void (*release)(struct io_mm *);
>> +
>> +	release = io_mm->release;
>> +	mm = io_mm->mm;
>> +
>> +	release(io_mm);
> 
> Is there any reason why you can't call iommu->release()
> here directly? Why do you need the release local variable?

I think I can remove the local variable

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-02-12 18:33     ` Jean-Philippe Brucker
  (?)
@ 2018-03-05 15:28         ` Sinan Kaya
  -1 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-03-05 15:28 UTC (permalink / raw)
  To: Jean-Philippe Brucker,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: mark.rutland-5wv7dgnIgG8,
	ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	catalin.marinas-5wv7dgnIgG8, xuzaibo-hv44wF8Li93QT0dZR+AlfA,
	will.deacon-5wv7dgnIgG8, ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	lenb-DgEjT+Ai2ygdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rjw-LthD3rsA81gm4RdzfppkhA, sudeep.holla-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> +static void io_mm_free(struct io_mm *io_mm)
> +{
> +	struct mm_struct *mm;
> +	void (*release)(struct io_mm *);
> +
> +	release = io_mm->release;
> +	mm = io_mm->mm;
> +
> +	release(io_mm);

Is there any reason why you can't call iommu->release()
here directly? Why do you need the release local variable?

> +	mmdrop(mm);
> +}
> +


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-05 15:28         ` Sinan Kaya
  0 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-03-05 15:28 UTC (permalink / raw)
  To: Jean-Philippe Brucker, linux-arm-kernel, linux-pci, linux-acpi,
	devicetree, iommu, kvm
  Cc: joro, robh+dt, mark.rutland, catalin.marinas, will.deacon,
	lorenzo.pieralisi, hanjun.guo, sudeep.holla, rjw, lenb,
	robin.murphy, bhelgaas, alex.williamson, tn, liubo95,
	thunder.leizhen, xieyisheng1, xuzaibo, ilias.apalodimas,
	jonathan.cameron, shunyong.yang, nwatters, jcrouse, rfranz,
	dwmw2, jacob.jun.pan, yi.l.liu, ashok.raj, robdclark,
	christian.koenig, bharatku

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> +static void io_mm_free(struct io_mm *io_mm)
> +{
> +	struct mm_struct *mm;
> +	void (*release)(struct io_mm *);
> +
> +	release = io_mm->release;
> +	mm = io_mm->mm;
> +
> +	release(io_mm);

Is there any reason why you can't call iommu->release()
here directly? Why do you need the release local variable?

> +	mmdrop(mm);
> +}
> +


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-05 15:28         ` Sinan Kaya
  0 siblings, 0 replies; 62+ messages in thread
From: Sinan Kaya @ 2018-03-05 15:28 UTC (permalink / raw)
  To: linux-arm-kernel

On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> +static void io_mm_free(struct io_mm *io_mm)
> +{
> +	struct mm_struct *mm;
> +	void (*release)(struct io_mm *);
> +
> +	release = io_mm->release;
> +	mm = io_mm->mm;
> +
> +	release(io_mm);

Is there any reason why you can't call iommu->release()
here directly? Why do you need the release local variable?

> +	mmdrop(mm);
> +}
> +


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-03-01  8:04             ` Christian König
  (?)
@ 2018-03-02 16:42                 ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-02 16:42 UTC (permalink / raw)
  To: Christian König, Lu Baolu,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: Mark Rutland, bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w, rjw-LthD3rsA81gm4RdzfppkhA,
	Catalin Marinas, xuzaibo-hv44wF8Li93QT0dZR+AlfA,
	ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, Sudeep Holla,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	lenb-DgEjT+Ai2ygdnm+yROfE0A

On 01/03/18 08:04, Christian König wrote:
> Am 01.03.2018 um 07:52 schrieb Lu Baolu:
>> Hi Jean,
>>
>> On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
>>> [SNIP]
>>> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
>>> +				 dev_param->max_pasid + 1, GFP_ATOMIC);
>> Can the pasid management code be moved into a common library?
>> PASID is not stick to SVA. An IOMMU model device could be designed
>> to use PASID for second level translation (classical DMA translation)
>> as well.
> 
> Yeah, we have the same problem on amdgpu.
> 
> We assign PASIDs to clients even when IOMMU isn't present in the system 
> just because we need it for debugging.
> 
> E.g. when the hardware detects that some shader program is doing 
> something nasty we get the PASID in the interrupt and could for example 
> use it to inform the client about the fault.

This seems like a new requirement altogether, and a bit outside the
scope of this series to be honest. Is the client userspace in this
context? I guess it would be mostly for prototyping then? Otherwise you
probably don't want to hand GPU contexts to userspace without an IOMMU
isolating them?

If you don't need mm tracking/sharing or iommu_map/unmap, then maybe an
IDR private to the GPU driver would be enough? If you do need mm
tracking, I suppose we could modify iommu_sva_bind() to allocate and
track io_mm even if the given device doesn't have an IOMMU, but it seems
a bit backward.

Thanks,
Jean
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-02 16:42                 ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-02 16:42 UTC (permalink / raw)
  To: Christian König, Lu Baolu, linux-arm-kernel, linux-pci,
	linux-acpi, devicetree, iommu, kvm
  Cc: Mark Rutland, bharatku, ashok.raj, shunyong.yang, rjw,
	Catalin Marinas, xuzaibo, ilias.apalodimas, Will Deacon, okaya,
	bhelgaas, robh+dt, Sudeep Holla, rfranz, dwmw2, lenb

T24gMDEvMDMvMTggMDg6MDQsIENocmlzdGlhbiBLw7ZuaWcgd3JvdGU6Cj4gQW0gMDEuMDMuMjAx
OCB1bSAwNzo1MiBzY2hyaWViIEx1IEJhb2x1Ogo+PiBIaSBKZWFuLAo+Pgo+PiBPbiAwMi8xMy8y
MDE4IDAyOjMzIEFNLCBKZWFuLVBoaWxpcHBlIEJydWNrZXIgd3JvdGU6Cj4+PiBbU05JUF0KPj4+
ICsJcGFzaWQgPSBpZHJfYWxsb2NfY3ljbGljKCZpb21tdV9wYXNpZF9pZHIsIGlvX21tLCBkZXZf
cGFyYW0tPm1pbl9wYXNpZCwKPj4+ICsJCQkJIGRldl9wYXJhbS0+bWF4X3Bhc2lkICsgMSwgR0ZQ
X0FUT01JQyk7Cj4+IENhbiB0aGUgcGFzaWQgbWFuYWdlbWVudCBjb2RlIGJlIG1vdmVkIGludG8g
YSBjb21tb24gbGlicmFyeT8KPj4gUEFTSUQgaXMgbm90IHN0aWNrIHRvIFNWQS4gQW4gSU9NTVUg
bW9kZWwgZGV2aWNlIGNvdWxkIGJlIGRlc2lnbmVkCj4+IHRvIHVzZSBQQVNJRCBmb3Igc2Vjb25k
IGxldmVsIHRyYW5zbGF0aW9uIChjbGFzc2ljYWwgRE1BIHRyYW5zbGF0aW9uKQo+PiBhcyB3ZWxs
Lgo+IAo+IFllYWgsIHdlIGhhdmUgdGhlIHNhbWUgcHJvYmxlbSBvbiBhbWRncHUuCj4gCj4gV2Ug
YXNzaWduIFBBU0lEcyB0byBjbGllbnRzIGV2ZW4gd2hlbiBJT01NVSBpc24ndCBwcmVzZW50IGlu
IHRoZSBzeXN0ZW0gCj4ganVzdCBiZWNhdXNlIHdlIG5lZWQgaXQgZm9yIGRlYnVnZ2luZy4KPiAK
PiBFLmcuIHdoZW4gdGhlIGhhcmR3YXJlIGRldGVjdHMgdGhhdCBzb21lIHNoYWRlciBwcm9ncmFt
IGlzIGRvaW5nIAo+IHNvbWV0aGluZyBuYXN0eSB3ZSBnZXQgdGhlIFBBU0lEIGluIHRoZSBpbnRl
cnJ1cHQgYW5kIGNvdWxkIGZvciBleGFtcGxlIAo+IHVzZSBpdCB0byBpbmZvcm0gdGhlIGNsaWVu
dCBhYm91dCB0aGUgZmF1bHQuCgpUaGlzIHNlZW1zIGxpa2UgYSBuZXcgcmVxdWlyZW1lbnQgYWx0
b2dldGhlciwgYW5kIGEgYml0IG91dHNpZGUgdGhlCnNjb3BlIG9mIHRoaXMgc2VyaWVzIHRvIGJl
IGhvbmVzdC4gSXMgdGhlIGNsaWVudCB1c2Vyc3BhY2UgaW4gdGhpcwpjb250ZXh0PyBJIGd1ZXNz
IGl0IHdvdWxkIGJlIG1vc3RseSBmb3IgcHJvdG90eXBpbmcgdGhlbj8gT3RoZXJ3aXNlIHlvdQpw
cm9iYWJseSBkb24ndCB3YW50IHRvIGhhbmQgR1BVIGNvbnRleHRzIHRvIHVzZXJzcGFjZSB3aXRo
b3V0IGFuIElPTU1VCmlzb2xhdGluZyB0aGVtPwoKSWYgeW91IGRvbid0IG5lZWQgbW0gdHJhY2tp
bmcvc2hhcmluZyBvciBpb21tdV9tYXAvdW5tYXAsIHRoZW4gbWF5YmUgYW4KSURSIHByaXZhdGUg
dG8gdGhlIEdQVSBkcml2ZXIgd291bGQgYmUgZW5vdWdoPyBJZiB5b3UgZG8gbmVlZCBtbQp0cmFj
a2luZywgSSBzdXBwb3NlIHdlIGNvdWxkIG1vZGlmeSBpb21tdV9zdmFfYmluZCgpIHRvIGFsbG9j
YXRlIGFuZAp0cmFjayBpb19tbSBldmVuIGlmIHRoZSBnaXZlbiBkZXZpY2UgZG9lc24ndCBoYXZl
IGFuIElPTU1VLCBidXQgaXQgc2VlbXMKYSBiaXQgYmFja3dhcmQuCgpUaGFua3MsCkplYW4KCl9f
X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fCmxpbnV4LWFybS1r
ZXJuZWwgbWFpbGluZyBsaXN0CmxpbnV4LWFybS1rZXJuZWxAbGlzdHMuaW5mcmFkZWFkLm9yZwpo
dHRwOi8vbGlzdHMuaW5mcmFkZWFkLm9yZy9tYWlsbWFuL2xpc3RpbmZvL2xpbnV4LWFybS1rZXJu
ZWwK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-02 16:42                 ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-02 16:42 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/03/18 08:04, Christian K?nig wrote:
> Am 01.03.2018 um 07:52 schrieb Lu Baolu:
>> Hi Jean,
>>
>> On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
>>> [SNIP]
>>> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
>>> +				 dev_param->max_pasid + 1, GFP_ATOMIC);
>> Can the pasid management code be moved into a common library?
>> PASID is not stick to SVA. An IOMMU model device could be designed
>> to use PASID for second level translation (classical DMA translation)
>> as well.
> 
> Yeah, we have the same problem on amdgpu.
> 
> We assign PASIDs to clients even when IOMMU isn't present in the system 
> just because we need it for debugging.
> 
> E.g. when the hardware detects that some shader program is doing 
> something nasty we get the PASID in the interrupt and could for example 
> use it to inform the client about the fault.

This seems like a new requirement altogether, and a bit outside the
scope of this series to be honest. Is the client userspace in this
context? I guess it would be mostly for prototyping then? Otherwise you
probably don't want to hand GPU contexts to userspace without an IOMMU
isolating them?

If you don't need mm tracking/sharing or iommu_map/unmap, then maybe an
IDR private to the GPU driver would be enough? If you do need mm
tracking, I suppose we could modify iommu_sva_bind() to allocate and
track io_mm even if the given device doesn't have an IOMMU, but it seems
a bit backward.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-03-01  6:52         ` Lu Baolu
  (?)
@ 2018-03-02 16:19             ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-02 16:19 UTC (permalink / raw)
  To: Lu Baolu, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: Mark Rutland, bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w, rjw-LthD3rsA81gm4RdzfppkhA,
	Catalin Marinas, xuzaibo-hv44wF8Li93QT0dZR+AlfA,
	ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A, Will Deacon,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, Sudeep Holla,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	christian.koenig-5C7GfCeVMHo, lenb-DgEjT+Ai2ygdnm+yROfE0A

On 01/03/18 06:52, Lu Baolu wrote:
> Can the pasid management code be moved into a common library?
> PASID is not stick to SVA. An IOMMU model device could be designed
> to use PASID for second level translation (classical DMA translation)
> as well.

What do you mean by second level translation? Do you see a use-case with
nesting translation within the host?

I agree that PASID + classical DMA is desirable. A device driver would
allocate PASIDs and perform iommu_sva_map(domain, pasid, iova, pa, size,
prot) and iommu_sva_unmap(domain, pasid, iova, size). I'm hoping that we
can also augment the DMA API with PASIDs, and that a driver can use both
shared and private contexts simultaneously. So that it can use a few
PASIDs for management purpose, and assign the rest to userspace.

The intent is for iommu-sva.c to be this common library. Work for
"private" PASID allocation is underway, see Jordan Crouse's series
posted last week https://www.spinics.net/lists/arm-kernel/msg635857.html

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-02 16:19             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-02 16:19 UTC (permalink / raw)
  To: Lu Baolu, linux-arm-kernel, linux-pci, linux-acpi, devicetree,
	iommu, kvm
  Cc: Mark Rutland, bharatku, ashok.raj, shunyong.yang, rjw,
	Catalin Marinas, xuzaibo, ilias.apalodimas, Will Deacon, okaya,
	bhelgaas, robh+dt, Sudeep Holla, rfranz, dwmw2, christian.koenig,
	lenb

On 01/03/18 06:52, Lu Baolu wrote:
> Can the pasid management code be moved into a common library?
> PASID is not stick to SVA. An IOMMU model device could be designed
> to use PASID for second level translation (classical DMA translation)
> as well.

What do you mean by second level translation? Do you see a use-case with
nesting translation within the host?

I agree that PASID + classical DMA is desirable. A device driver would
allocate PASIDs and perform iommu_sva_map(domain, pasid, iova, pa, size,
prot) and iommu_sva_unmap(domain, pasid, iova, size). I'm hoping that we
can also augment the DMA API with PASIDs, and that a driver can use both
shared and private contexts simultaneously. So that it can use a few
PASIDs for management purpose, and assign the rest to userspace.

The intent is for iommu-sva.c to be this common library. Work for
"private" PASID allocation is underway, see Jordan Crouse's series
posted last week https://www.spinics.net/lists/arm-kernel/msg635857.html

Thanks,
Jean


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-02 16:19             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-02 16:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/03/18 06:52, Lu Baolu wrote:
> Can the pasid management code be moved into a common library?
> PASID is not stick to SVA. An IOMMU model device could be designed
> to use PASID for second level translation (classical DMA translation)
> as well.

What do you mean by second level translation? Do you see a use-case with
nesting translation within the host?

I agree that PASID + classical DMA is desirable. A device driver would
allocate PASIDs and perform iommu_sva_map(domain, pasid, iova, pa, size,
prot) and iommu_sva_unmap(domain, pasid, iova, size). I'm hoping that we
can also augment the DMA API with PASIDs, and that a driver can use both
shared and private contexts simultaneously. So that it can use a few
PASIDs for management purpose, and assign the rest to userspace.

The intent is for iommu-sva.c to be this common library. Work for
"private" PASID allocation is underway, see Jordan Crouse's series
posted last week https://www.spinics.net/lists/arm-kernel/msg635857.html

Thanks,
Jean

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-03-01  6:52         ` Lu Baolu
  (?)
@ 2018-03-01  8:04             ` Christian König
  -1 siblings, 0 replies; 62+ messages in thread
From: Christian König @ 2018-03-01  8:04 UTC (permalink / raw)
  To: Lu Baolu, Jean-Philippe Brucker,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: mark.rutland-5wv7dgnIgG8, bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w, rjw-LthD3rsA81gm4RdzfppkhA,
	catalin.marinas-5wv7dgnIgG8, xuzaibo-hv44wF8Li93QT0dZR+AlfA,
	ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, sudeep.holla-5wv7dgnIgG8,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	lenb-DgEjT+Ai2ygdnm+yROfE0A

Am 01.03.2018 um 07:52 schrieb Lu Baolu:
> Hi Jean,
>
> On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
>> [SNIP]
>> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
>> +				 dev_param->max_pasid + 1, GFP_ATOMIC);
> Can the pasid management code be moved into a common library?
> PASID is not stick to SVA. An IOMMU model device could be designed
> to use PASID for second level translation (classical DMA translation)
> as well.

Yeah, we have the same problem on amdgpu.

We assign PASIDs to clients even when IOMMU isn't present in the system 
just because we need it for debugging.

E.g. when the hardware detects that some shader program is doing 
something nasty we get the PASID in the interrupt and could for example 
use it to inform the client about the fault.

Regards,
Christian.

>
> Best regards,
> Lu Baolu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-01  8:04             ` Christian König
  0 siblings, 0 replies; 62+ messages in thread
From: Christian König @ 2018-03-01  8:04 UTC (permalink / raw)
  To: Lu Baolu, Jean-Philippe Brucker, linux-arm-kernel, linux-pci,
	linux-acpi, devicetree, iommu, kvm
  Cc: mark.rutland, ilias.apalodimas, catalin.marinas, xuzaibo,
	will.deacon, okaya, ashok.raj, bharatku, rfranz, lenb, robh+dt,
	bhelgaas, shunyong.yang, dwmw2, rjw, sudeep.holla

Am 01.03.2018 um 07:52 schrieb Lu Baolu:
> Hi Jean,
>
> On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
>> [SNIP]
>> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
>> +				 dev_param->max_pasid + 1, GFP_ATOMIC);
> Can the pasid management code be moved into a common library?
> PASID is not stick to SVA. An IOMMU model device could be designed
> to use PASID for second level translation (classical DMA translation)
> as well.

Yeah, we have the same problem on amdgpu.

We assign PASIDs to clients even when IOMMU isn't present in the system 
just because we need it for debugging.

E.g. when the hardware detects that some shader program is doing 
something nasty we get the PASID in the interrupt and could for example 
use it to inform the client about the fault.

Regards,
Christian.

>
> Best regards,
> Lu Baolu


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-01  8:04             ` Christian König
  0 siblings, 0 replies; 62+ messages in thread
From: Christian König @ 2018-03-01  8:04 UTC (permalink / raw)
  To: linux-arm-kernel

Am 01.03.2018 um 07:52 schrieb Lu Baolu:
> Hi Jean,
>
> On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
>> [SNIP]
>> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
>> +				 dev_param->max_pasid + 1, GFP_ATOMIC);
> Can the pasid management code be moved into a common library?
> PASID is not stick to SVA. An IOMMU model device could be designed
> to use PASID for second level translation (classical DMA translation)
> as well.

Yeah, we have the same problem on amdgpu.

We assign PASIDs to clients even when IOMMU isn't present in the system 
just because we need it for debugging.

E.g. when the hardware detects that some shader program is doing 
something nasty we get the PASID in the interrupt and could for example 
use it to inform the client about the fault.

Regards,
Christian.

>
> Best regards,
> Lu Baolu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-02-12 18:33     ` Jean-Philippe Brucker
  (?)
@ 2018-03-01  6:52         ` Lu Baolu
  -1 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2018-03-01  6:52 UTC (permalink / raw)
  To: Jean-Philippe Brucker,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: mark.rutland-5wv7dgnIgG8, bharatku-gjFFaj9aHVfQT0dZR+AlfA,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w, rjw-LthD3rsA81gm4RdzfppkhA,
	catalin.marinas-5wv7dgnIgG8, xuzaibo-hv44wF8Li93QT0dZR+AlfA,
	ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	okaya-sgV2jX0FEOL9JmXXK+q4OQ, bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, sudeep.holla-5wv7dgnIgG8,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	christian.koenig-5C7GfCeVMHo, lenb-DgEjT+Ai2ygdnm+yROfE0A

Hi Jean,

On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
> Introduce boilerplate code for allocating IOMMU mm structures and binding
> them to devices. Four operations are added to IOMMU drivers:
>
> * mm_alloc(): to create an io_mm structure and perform architecture-
>   specific operations required to grab the process (for instance on ARM,
>   pin down the CPU ASID so that the process doesn't get assigned a new
>   ASID on rollover).
>
>   There is a single valid io_mm structure per Linux mm. Future extensions
>   may also use io_mm for kernel-managed address spaces, populated with
>   map()/unmap() calls instead of bound to process address spaces. This
>   patch focuses on "shared" io_mm.
>
> * mm_attach(): attach an mm to a device. The IOMMU driver checks that the
>   device is capable of sharing an address space, and writes the PASID
>   table entry to install the pgd.
>
>   Some IOMMU drivers will have a single PASID table per domain, for
>   convenience. Other can implement it differently but to help these
>   drivers, mm_attach and mm_detach take 'attach_domain' and
>   'detach_domain' parameters, that tell whether they need to set and clear
>   the PASID entry or only send the required TLB invalidations.
>
> * mm_detach(): detach an mm from a device. The IOMMU driver removes the
>   PASID table entry and invalidates the IOTLBs.
>
> * mm_free(): free a structure allocated by mm_alloc(), and let arch
>   release the process.
>
> mm_attach and mm_detach operations are serialized with a spinlock. At the
> moment it is global, but if we try to optimize it, the core should at
> least prevent concurrent attach()/detach() on the same domain (so
> multi-level PASID table code can allocate tables lazily). mm_alloc() can
> sleep, but mm_free must not (because we'll have to call it from call_srcu
> later on.)
>
> At the moment we use an IDR for allocating PASIDs and retrieving contexts.
> We also use a single spinlock. These can be refined and optimized later (a
> custom allocator will be needed for top-down PASID allocation).
>
> Keeping track of address spaces requires the use of MMU notifiers.
> Handling process exit with regard to unbind() is tricky, so it is left for
> another patch and we explicitly fail mm_alloc() for the moment.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/iommu-sva.c | 382 +++++++++++++++++++++++++++++++++++++++++++++-
>  drivers/iommu/iommu.c     |   2 +
>  include/linux/iommu.h     |  25 +++
>  3 files changed, 406 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
> index 593685d891bf..f9af9d66b3ed 100644
> --- a/drivers/iommu/iommu-sva.c
> +++ b/drivers/iommu/iommu-sva.c
> @@ -7,11 +7,321 @@
>   * SPDX-License-Identifier: GPL-2.0
>   */
>  
> +#include <linux/idr.h>
>  #include <linux/iommu.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +/**
> + * DOC: io_mm model
> + *
> + * The io_mm keeps track of process address spaces shared between CPU and IOMMU.
> + * The following example illustrates the relation between structures
> + * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and
> + * device. A device can have multiple io_mm and an io_mm may be bound to
> + * multiple devices.
> + *              ___________________________
> + *             |  IOMMU domain A           |
> + *             |  ________________         |
> + *             | |  IOMMU group   |        +------- io_pgtables
> + *             | |                |        |
> + *             | |   dev 00:00.0 ----+------- bond --- io_mm X
> + *             | |________________|   \    |
> + *             |                       '----- bond ---.
> + *             |___________________________|           \
> + *              ___________________________             \
> + *             |  IOMMU domain B           |           io_mm Y
> + *             |  ________________         |           / /
> + *             | |  IOMMU group   |        |          / /
> + *             | |                |        |         / /
> + *             | |   dev 00:01.0 ------------ bond -' /
> + *             | |   dev 00:01.1 ------------ bond --'
> + *             | |________________|        |
> + *             |                           +------- io_pgtables
> + *             |___________________________|
> + *
> + * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain
> + * B. All devices within the same domain access the same address spaces. Device
> + * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct.
> + * Devices 00:01.* only access address space Y. In addition each
> + * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is
> + * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU.
> + *
> + * To obtain the above configuration, users would for instance issue the
> + * following calls:
> + *
> + *     iommu_sva_bind_device(dev 00:00.0, mm X, ...) -> PASID 1
> + *     iommu_sva_bind_device(dev 00:00.0, mm Y, ...) -> PASID 2
> + *     iommu_sva_bind_device(dev 00:01.0, mm Y, ...) -> PASID 2
> + *     iommu_sva_bind_device(dev 00:01.1, mm Y, ...) -> PASID 2
> + *
> + * A single Process Address Space ID (PASID) is allocated for each mm. In the
> + * example, devices use PASID 1 to read/write into address space X and PASID 2
> + * to read/write into address space Y.
> + *
> + * Hardware tables describing this configuration in the IOMMU would typically
> + * look like this:
> + *
> + *                                PASID tables
> + *                                 of domain A
> + *                              .->+--------+
> + *                             / 0 |        |-------> io_pgtable
> + *                            /    +--------+
> + *            Device tables  /   1 |        |-------> pgd X
> + *              +--------+  /      +--------+
> + *      00:00.0 |      A |-'     2 |        |--.
> + *              +--------+         +--------+   \
> + *              :        :       3 |        |    \
> + *              +--------+         +--------+     --> pgd Y
> + *      00:01.0 |      B |--.                    /
> + *              +--------+   \                  |
> + *      00:01.1 |      B |----+   PASID tables  |
> + *              +--------+     \   of domain B  |
> + *                              '->+--------+   |
> + *                               0 |        |-- | --> io_pgtable
> + *                                 +--------+   |
> + *                               1 |        |   |
> + *                                 +--------+   |
> + *                               2 |        |---'
> + *                                 +--------+
> + *                               3 |        |
> + *                                 +--------+
> + *
> + * With this model, a single call binds all devices in a given domain to an
> + * address space. Other devices in the domain will get the same bond implicitly.
> + * However, users must issue one bind() for each device, because IOMMUs may
> + * implement SVA differently. Furthermore, mandating one bind() per device
> + * allows the driver to perform sanity-checks on device capabilities.
> + *
> + * On Arm and AMD IOMMUs, entry 0 of the PASID table can be used to hold
> + * non-PASID translations. In this case PASID 0 is reserved and entry 0 points
> + * to the io_pgtable base. On Intel IOMMU, the io_pgtable base would be held in
> + * the device table and PASID 0 would be available to the allocator.
> + */
>  
>  /* TODO: stub for the fault queue. Remove later. */
>  #define iommu_fault_queue_flush(...)
>  
> +struct iommu_bond {
> +	struct io_mm		*io_mm;
> +	struct device		*dev;
> +	struct iommu_domain	*domain;
> +
> +	struct list_head	mm_head;
> +	struct list_head	dev_head;
> +	struct list_head	domain_head;
> +
> +	void			*drvdata;
> +
> +	/* Number of bind() calls */
> +	refcount_t		refs;
> +};
> +
> +/*
> + * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is
> + * used for returning errors). In practice implementations will use at most 20
> + * bits, which is the PCI limit.
> + */
> +static DEFINE_IDR(iommu_pasid_idr);
> +
> +/*
> + * For the moment this is an all-purpose lock. It serializes
> + * access/modifications to bonds, access/modifications to the PASID IDR, and
> + * changes to io_mm refcount as well.
> + */
> +static DEFINE_SPINLOCK(iommu_sva_lock);
> +
> +static struct io_mm *
> +io_mm_alloc(struct iommu_domain *domain, struct device *dev,
> +	    struct mm_struct *mm)
> +{
> +	int ret;
> +	int pasid;
> +	struct io_mm *io_mm;
> +	struct iommu_param *dev_param = dev->iommu_param;
> +
> +	if (!dev_param || !domain->ops->mm_alloc || !domain->ops->mm_free)
> +		return ERR_PTR(-ENODEV);
> +
> +	io_mm = domain->ops->mm_alloc(domain, mm);
> +	if (IS_ERR(io_mm))
> +		return io_mm;
> +	if (!io_mm)
> +		return ERR_PTR(-ENOMEM);
> +
> +	/*
> +	 * The mm must not be freed until after the driver frees the io_mm
> +	 * (which may involve unpinning the CPU ASID for instance, requiring a
> +	 * valid mm struct.)
> +	 */
> +	mmgrab(mm);
> +
> +	io_mm->mm		= mm;
> +	io_mm->release		= domain->ops->mm_free;
> +	INIT_LIST_HEAD(&io_mm->devices);
> +
> +	idr_preload(GFP_KERNEL);
> +	spin_lock(&iommu_sva_lock);
> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
> +				 dev_param->max_pasid + 1, GFP_ATOMIC);

Can the pasid management code be moved into a common library?
PASID is not stick to SVA. An IOMMU model device could be designed
to use PASID for second level translation (classical DMA translation)
as well.

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-01  6:52         ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2018-03-01  6:52 UTC (permalink / raw)
  To: Jean-Philippe Brucker, linux-arm-kernel, linux-pci, linux-acpi,
	devicetree, iommu, kvm
  Cc: mark.rutland, ilias.apalodimas, catalin.marinas, xuzaibo,
	will.deacon, okaya, ashok.raj, bharatku, rfranz, lenb, robh+dt,
	bhelgaas, shunyong.yang, dwmw2, rjw, sudeep.holla,
	christian.koenig

Hi Jean,

On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
> Introduce boilerplate code for allocating IOMMU mm structures and binding
> them to devices. Four operations are added to IOMMU drivers:
>
> * mm_alloc(): to create an io_mm structure and perform architecture-
>   specific operations required to grab the process (for instance on ARM,
>   pin down the CPU ASID so that the process doesn't get assigned a new
>   ASID on rollover).
>
>   There is a single valid io_mm structure per Linux mm. Future extensions
>   may also use io_mm for kernel-managed address spaces, populated with
>   map()/unmap() calls instead of bound to process address spaces. This
>   patch focuses on "shared" io_mm.
>
> * mm_attach(): attach an mm to a device. The IOMMU driver checks that the
>   device is capable of sharing an address space, and writes the PASID
>   table entry to install the pgd.
>
>   Some IOMMU drivers will have a single PASID table per domain, for
>   convenience. Other can implement it differently but to help these
>   drivers, mm_attach and mm_detach take 'attach_domain' and
>   'detach_domain' parameters, that tell whether they need to set and clear
>   the PASID entry or only send the required TLB invalidations.
>
> * mm_detach(): detach an mm from a device. The IOMMU driver removes the
>   PASID table entry and invalidates the IOTLBs.
>
> * mm_free(): free a structure allocated by mm_alloc(), and let arch
>   release the process.
>
> mm_attach and mm_detach operations are serialized with a spinlock. At the
> moment it is global, but if we try to optimize it, the core should at
> least prevent concurrent attach()/detach() on the same domain (so
> multi-level PASID table code can allocate tables lazily). mm_alloc() can
> sleep, but mm_free must not (because we'll have to call it from call_srcu
> later on.)
>
> At the moment we use an IDR for allocating PASIDs and retrieving contexts.
> We also use a single spinlock. These can be refined and optimized later (a
> custom allocator will be needed for top-down PASID allocation).
>
> Keeping track of address spaces requires the use of MMU notifiers.
> Handling process exit with regard to unbind() is tricky, so it is left for
> another patch and we explicitly fail mm_alloc() for the moment.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> ---
>  drivers/iommu/iommu-sva.c | 382 +++++++++++++++++++++++++++++++++++++++++++++-
>  drivers/iommu/iommu.c     |   2 +
>  include/linux/iommu.h     |  25 +++
>  3 files changed, 406 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
> index 593685d891bf..f9af9d66b3ed 100644
> --- a/drivers/iommu/iommu-sva.c
> +++ b/drivers/iommu/iommu-sva.c
> @@ -7,11 +7,321 @@
>   * SPDX-License-Identifier: GPL-2.0
>   */
>  
> +#include <linux/idr.h>
>  #include <linux/iommu.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +/**
> + * DOC: io_mm model
> + *
> + * The io_mm keeps track of process address spaces shared between CPU and IOMMU.
> + * The following example illustrates the relation between structures
> + * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and
> + * device. A device can have multiple io_mm and an io_mm may be bound to
> + * multiple devices.
> + *              ___________________________
> + *             |  IOMMU domain A           |
> + *             |  ________________         |
> + *             | |  IOMMU group   |        +------- io_pgtables
> + *             | |                |        |
> + *             | |   dev 00:00.0 ----+------- bond --- io_mm X
> + *             | |________________|   \    |
> + *             |                       '----- bond ---.
> + *             |___________________________|           \
> + *              ___________________________             \
> + *             |  IOMMU domain B           |           io_mm Y
> + *             |  ________________         |           / /
> + *             | |  IOMMU group   |        |          / /
> + *             | |                |        |         / /
> + *             | |   dev 00:01.0 ------------ bond -' /
> + *             | |   dev 00:01.1 ------------ bond --'
> + *             | |________________|        |
> + *             |                           +------- io_pgtables
> + *             |___________________________|
> + *
> + * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain
> + * B. All devices within the same domain access the same address spaces. Device
> + * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct.
> + * Devices 00:01.* only access address space Y. In addition each
> + * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is
> + * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU.
> + *
> + * To obtain the above configuration, users would for instance issue the
> + * following calls:
> + *
> + *     iommu_sva_bind_device(dev 00:00.0, mm X, ...) -> PASID 1
> + *     iommu_sva_bind_device(dev 00:00.0, mm Y, ...) -> PASID 2
> + *     iommu_sva_bind_device(dev 00:01.0, mm Y, ...) -> PASID 2
> + *     iommu_sva_bind_device(dev 00:01.1, mm Y, ...) -> PASID 2
> + *
> + * A single Process Address Space ID (PASID) is allocated for each mm. In the
> + * example, devices use PASID 1 to read/write into address space X and PASID 2
> + * to read/write into address space Y.
> + *
> + * Hardware tables describing this configuration in the IOMMU would typically
> + * look like this:
> + *
> + *                                PASID tables
> + *                                 of domain A
> + *                              .->+--------+
> + *                             / 0 |        |-------> io_pgtable
> + *                            /    +--------+
> + *            Device tables  /   1 |        |-------> pgd X
> + *              +--------+  /      +--------+
> + *      00:00.0 |      A |-'     2 |        |--.
> + *              +--------+         +--------+   \
> + *              :        :       3 |        |    \
> + *              +--------+         +--------+     --> pgd Y
> + *      00:01.0 |      B |--.                    /
> + *              +--------+   \                  |
> + *      00:01.1 |      B |----+   PASID tables  |
> + *              +--------+     \   of domain B  |
> + *                              '->+--------+   |
> + *                               0 |        |-- | --> io_pgtable
> + *                                 +--------+   |
> + *                               1 |        |   |
> + *                                 +--------+   |
> + *                               2 |        |---'
> + *                                 +--------+
> + *                               3 |        |
> + *                                 +--------+
> + *
> + * With this model, a single call binds all devices in a given domain to an
> + * address space. Other devices in the domain will get the same bond implicitly.
> + * However, users must issue one bind() for each device, because IOMMUs may
> + * implement SVA differently. Furthermore, mandating one bind() per device
> + * allows the driver to perform sanity-checks on device capabilities.
> + *
> + * On Arm and AMD IOMMUs, entry 0 of the PASID table can be used to hold
> + * non-PASID translations. In this case PASID 0 is reserved and entry 0 points
> + * to the io_pgtable base. On Intel IOMMU, the io_pgtable base would be held in
> + * the device table and PASID 0 would be available to the allocator.
> + */
>  
>  /* TODO: stub for the fault queue. Remove later. */
>  #define iommu_fault_queue_flush(...)
>  
> +struct iommu_bond {
> +	struct io_mm		*io_mm;
> +	struct device		*dev;
> +	struct iommu_domain	*domain;
> +
> +	struct list_head	mm_head;
> +	struct list_head	dev_head;
> +	struct list_head	domain_head;
> +
> +	void			*drvdata;
> +
> +	/* Number of bind() calls */
> +	refcount_t		refs;
> +};
> +
> +/*
> + * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is
> + * used for returning errors). In practice implementations will use at most 20
> + * bits, which is the PCI limit.
> + */
> +static DEFINE_IDR(iommu_pasid_idr);
> +
> +/*
> + * For the moment this is an all-purpose lock. It serializes
> + * access/modifications to bonds, access/modifications to the PASID IDR, and
> + * changes to io_mm refcount as well.
> + */
> +static DEFINE_SPINLOCK(iommu_sva_lock);
> +
> +static struct io_mm *
> +io_mm_alloc(struct iommu_domain *domain, struct device *dev,
> +	    struct mm_struct *mm)
> +{
> +	int ret;
> +	int pasid;
> +	struct io_mm *io_mm;
> +	struct iommu_param *dev_param = dev->iommu_param;
> +
> +	if (!dev_param || !domain->ops->mm_alloc || !domain->ops->mm_free)
> +		return ERR_PTR(-ENODEV);
> +
> +	io_mm = domain->ops->mm_alloc(domain, mm);
> +	if (IS_ERR(io_mm))
> +		return io_mm;
> +	if (!io_mm)
> +		return ERR_PTR(-ENOMEM);
> +
> +	/*
> +	 * The mm must not be freed until after the driver frees the io_mm
> +	 * (which may involve unpinning the CPU ASID for instance, requiring a
> +	 * valid mm struct.)
> +	 */
> +	mmgrab(mm);
> +
> +	io_mm->mm		= mm;
> +	io_mm->release		= domain->ops->mm_free;
> +	INIT_LIST_HEAD(&io_mm->devices);
> +
> +	idr_preload(GFP_KERNEL);
> +	spin_lock(&iommu_sva_lock);
> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
> +				 dev_param->max_pasid + 1, GFP_ATOMIC);

Can the pasid management code be moved into a common library?
PASID is not stick to SVA. An IOMMU model device could be designed
to use PASID for second level translation (classical DMA translation)
as well.

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-03-01  6:52         ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2018-03-01  6:52 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jean,

On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
> Introduce boilerplate code for allocating IOMMU mm structures and binding
> them to devices. Four operations are added to IOMMU drivers:
>
> * mm_alloc(): to create an io_mm structure and perform architecture-
>   specific operations required to grab the process (for instance on ARM,
>   pin down the CPU ASID so that the process doesn't get assigned a new
>   ASID on rollover).
>
>   There is a single valid io_mm structure per Linux mm. Future extensions
>   may also use io_mm for kernel-managed address spaces, populated with
>   map()/unmap() calls instead of bound to process address spaces. This
>   patch focuses on "shared" io_mm.
>
> * mm_attach(): attach an mm to a device. The IOMMU driver checks that the
>   device is capable of sharing an address space, and writes the PASID
>   table entry to install the pgd.
>
>   Some IOMMU drivers will have a single PASID table per domain, for
>   convenience. Other can implement it differently but to help these
>   drivers, mm_attach and mm_detach take 'attach_domain' and
>   'detach_domain' parameters, that tell whether they need to set and clear
>   the PASID entry or only send the required TLB invalidations.
>
> * mm_detach(): detach an mm from a device. The IOMMU driver removes the
>   PASID table entry and invalidates the IOTLBs.
>
> * mm_free(): free a structure allocated by mm_alloc(), and let arch
>   release the process.
>
> mm_attach and mm_detach operations are serialized with a spinlock. At the
> moment it is global, but if we try to optimize it, the core should at
> least prevent concurrent attach()/detach() on the same domain (so
> multi-level PASID table code can allocate tables lazily). mm_alloc() can
> sleep, but mm_free must not (because we'll have to call it from call_srcu
> later on.)
>
> At the moment we use an IDR for allocating PASIDs and retrieving contexts.
> We also use a single spinlock. These can be refined and optimized later (a
> custom allocator will be needed for top-down PASID allocation).
>
> Keeping track of address spaces requires the use of MMU notifiers.
> Handling process exit with regard to unbind() is tricky, so it is left for
> another patch and we explicitly fail mm_alloc() for the moment.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> ---
>  drivers/iommu/iommu-sva.c | 382 +++++++++++++++++++++++++++++++++++++++++++++-
>  drivers/iommu/iommu.c     |   2 +
>  include/linux/iommu.h     |  25 +++
>  3 files changed, 406 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
> index 593685d891bf..f9af9d66b3ed 100644
> --- a/drivers/iommu/iommu-sva.c
> +++ b/drivers/iommu/iommu-sva.c
> @@ -7,11 +7,321 @@
>   * SPDX-License-Identifier: GPL-2.0
>   */
>  
> +#include <linux/idr.h>
>  #include <linux/iommu.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +/**
> + * DOC: io_mm model
> + *
> + * The io_mm keeps track of process address spaces shared between CPU and IOMMU.
> + * The following example illustrates the relation between structures
> + * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and
> + * device. A device can have multiple io_mm and an io_mm may be bound to
> + * multiple devices.
> + *              ___________________________
> + *             |  IOMMU domain A           |
> + *             |  ________________         |
> + *             | |  IOMMU group   |        +------- io_pgtables
> + *             | |                |        |
> + *             | |   dev 00:00.0 ----+------- bond --- io_mm X
> + *             | |________________|   \    |
> + *             |                       '----- bond ---.
> + *             |___________________________|           \
> + *              ___________________________             \
> + *             |  IOMMU domain B           |           io_mm Y
> + *             |  ________________         |           / /
> + *             | |  IOMMU group   |        |          / /
> + *             | |                |        |         / /
> + *             | |   dev 00:01.0 ------------ bond -' /
> + *             | |   dev 00:01.1 ------------ bond --'
> + *             | |________________|        |
> + *             |                           +------- io_pgtables
> + *             |___________________________|
> + *
> + * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain
> + * B. All devices within the same domain access the same address spaces. Device
> + * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct.
> + * Devices 00:01.* only access address space Y. In addition each
> + * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is
> + * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU.
> + *
> + * To obtain the above configuration, users would for instance issue the
> + * following calls:
> + *
> + *     iommu_sva_bind_device(dev 00:00.0, mm X, ...) -> PASID 1
> + *     iommu_sva_bind_device(dev 00:00.0, mm Y, ...) -> PASID 2
> + *     iommu_sva_bind_device(dev 00:01.0, mm Y, ...) -> PASID 2
> + *     iommu_sva_bind_device(dev 00:01.1, mm Y, ...) -> PASID 2
> + *
> + * A single Process Address Space ID (PASID) is allocated for each mm. In the
> + * example, devices use PASID 1 to read/write into address space X and PASID 2
> + * to read/write into address space Y.
> + *
> + * Hardware tables describing this configuration in the IOMMU would typically
> + * look like this:
> + *
> + *                                PASID tables
> + *                                 of domain A
> + *                              .->+--------+
> + *                             / 0 |        |-------> io_pgtable
> + *                            /    +--------+
> + *            Device tables  /   1 |        |-------> pgd X
> + *              +--------+  /      +--------+
> + *      00:00.0 |      A |-'     2 |        |--.
> + *              +--------+         +--------+   \
> + *              :        :       3 |        |    \
> + *              +--------+         +--------+     --> pgd Y
> + *      00:01.0 |      B |--.                    /
> + *              +--------+   \                  |
> + *      00:01.1 |      B |----+   PASID tables  |
> + *              +--------+     \   of domain B  |
> + *                              '->+--------+   |
> + *                               0 |        |-- | --> io_pgtable
> + *                                 +--------+   |
> + *                               1 |        |   |
> + *                                 +--------+   |
> + *                               2 |        |---'
> + *                                 +--------+
> + *                               3 |        |
> + *                                 +--------+
> + *
> + * With this model, a single call binds all devices in a given domain to an
> + * address space. Other devices in the domain will get the same bond implicitly.
> + * However, users must issue one bind() for each device, because IOMMUs may
> + * implement SVA differently. Furthermore, mandating one bind() per device
> + * allows the driver to perform sanity-checks on device capabilities.
> + *
> + * On Arm and AMD IOMMUs, entry 0 of the PASID table can be used to hold
> + * non-PASID translations. In this case PASID 0 is reserved and entry 0 points
> + * to the io_pgtable base. On Intel IOMMU, the io_pgtable base would be held in
> + * the device table and PASID 0 would be available to the allocator.
> + */
>  
>  /* TODO: stub for the fault queue. Remove later. */
>  #define iommu_fault_queue_flush(...)
>  
> +struct iommu_bond {
> +	struct io_mm		*io_mm;
> +	struct device		*dev;
> +	struct iommu_domain	*domain;
> +
> +	struct list_head	mm_head;
> +	struct list_head	dev_head;
> +	struct list_head	domain_head;
> +
> +	void			*drvdata;
> +
> +	/* Number of bind() calls */
> +	refcount_t		refs;
> +};
> +
> +/*
> + * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is
> + * used for returning errors). In practice implementations will use at most 20
> + * bits, which is the PCI limit.
> + */
> +static DEFINE_IDR(iommu_pasid_idr);
> +
> +/*
> + * For the moment this is an all-purpose lock. It serializes
> + * access/modifications to bonds, access/modifications to the PASID IDR, and
> + * changes to io_mm refcount as well.
> + */
> +static DEFINE_SPINLOCK(iommu_sva_lock);
> +
> +static struct io_mm *
> +io_mm_alloc(struct iommu_domain *domain, struct device *dev,
> +	    struct mm_struct *mm)
> +{
> +	int ret;
> +	int pasid;
> +	struct io_mm *io_mm;
> +	struct iommu_param *dev_param = dev->iommu_param;
> +
> +	if (!dev_param || !domain->ops->mm_alloc || !domain->ops->mm_free)
> +		return ERR_PTR(-ENODEV);
> +
> +	io_mm = domain->ops->mm_alloc(domain, mm);
> +	if (IS_ERR(io_mm))
> +		return io_mm;
> +	if (!io_mm)
> +		return ERR_PTR(-ENOMEM);
> +
> +	/*
> +	 * The mm must not be freed until after the driver frees the io_mm
> +	 * (which may involve unpinning the CPU ASID for instance, requiring a
> +	 * valid mm struct.)
> +	 */
> +	mmgrab(mm);
> +
> +	io_mm->mm		= mm;
> +	io_mm->release		= domain->ops->mm_free;
> +	INIT_LIST_HEAD(&io_mm->devices);
> +
> +	idr_preload(GFP_KERNEL);
> +	spin_lock(&iommu_sva_lock);
> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
> +				 dev_param->max_pasid + 1, GFP_ATOMIC);

Can the pasid management code be moved into a common library?
PASID is not stick to SVA. An IOMMU model device could be designed
to use PASID for second level translation (classical DMA translation)
as well.

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
  2018-02-12 18:33 [PATCH 00/37] Shared Virtual Addressing for the IOMMU Jean-Philippe Brucker
       [not found] ` <20180212183352.22730-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
@ 2018-02-12 18:33     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-02-12 18:33 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	kvm-u79uwXL29TY76Z2rM5mHXA
  Cc: joro-zLv9SwRftAIdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, catalin.marinas-5wv7dgnIgG8,
	will.deacon-5wv7dgnIgG8, lorenzo.pieralisi-5wv7dgnIgG8,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A, sudeep.holla-5wv7dgnIgG8,
	rjw-LthD3rsA81gm4RdzfppkhA, lenb-DgEjT+Ai2ygdnm+yROfE0A,
	robin.murphy-5wv7dgnIgG8, bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	tn-nYOzD4b6Jr9Wk0Htik3J/w, liubo95-hv44wF8Li93QT0dZR+AlfA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	xieyisheng1-hv44wF8Li93QT0dZR+AlfA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA,
	ilias.apalodimas-QSEj5FYQhm4dnm+yROfE0A,
	jonathan.cameron-hv44wF8Li93QT0dZR+AlfA,
	shunyong.yang-PT9Dzx9SjPiXmMXjJBpWqg,
	nwatters-sgV2jX0FEOL9JmXXK+q4OQ, okaya-sgV2jX0FEOL9JmXXK+q4OQ,
	jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA,
	yi.l.liu-ral2JQCrhuEAvxtiuMwx3w,
	ashok.raj-ral2JQCrhuEAvxtiuMwx3w,
	robdclark-Re5JQEeQqe8AvxtiuMwx3w, christian.koenig-5C7GfCeVMHo,
	bharatku-gjFFaj9aHVfQT0dZR+AlfA

Introduce boilerplate code for allocating IOMMU mm structures and binding
them to devices. Four operations are added to IOMMU drivers:

* mm_alloc(): to create an io_mm structure and perform architecture-
  specific operations required to grab the process (for instance on ARM,
  pin down the CPU ASID so that the process doesn't get assigned a new
  ASID on rollover).

  There is a single valid io_mm structure per Linux mm. Future extensions
  may also use io_mm for kernel-managed address spaces, populated with
  map()/unmap() calls instead of bound to process address spaces. This
  patch focuses on "shared" io_mm.

* mm_attach(): attach an mm to a device. The IOMMU driver checks that the
  device is capable of sharing an address space, and writes the PASID
  table entry to install the pgd.

  Some IOMMU drivers will have a single PASID table per domain, for
  convenience. Other can implement it differently but to help these
  drivers, mm_attach and mm_detach take 'attach_domain' and
  'detach_domain' parameters, that tell whether they need to set and clear
  the PASID entry or only send the required TLB invalidations.

* mm_detach(): detach an mm from a device. The IOMMU driver removes the
  PASID table entry and invalidates the IOTLBs.

* mm_free(): free a structure allocated by mm_alloc(), and let arch
  release the process.

mm_attach and mm_detach operations are serialized with a spinlock. At the
moment it is global, but if we try to optimize it, the core should at
least prevent concurrent attach()/detach() on the same domain (so
multi-level PASID table code can allocate tables lazily). mm_alloc() can
sleep, but mm_free must not (because we'll have to call it from call_srcu
later on.)

At the moment we use an IDR for allocating PASIDs and retrieving contexts.
We also use a single spinlock. These can be refined and optimized later (a
custom allocator will be needed for top-down PASID allocation).

Keeping track of address spaces requires the use of MMU notifiers.
Handling process exit with regard to unbind() is tricky, so it is left for
another patch and we explicitly fail mm_alloc() for the moment.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/iommu-sva.c | 382 +++++++++++++++++++++++++++++++++++++++++++++-
 drivers/iommu/iommu.c     |   2 +
 include/linux/iommu.h     |  25 +++
 3 files changed, 406 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 593685d891bf..f9af9d66b3ed 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -7,11 +7,321 @@
  * SPDX-License-Identifier: GPL-2.0
  */
 
+#include <linux/idr.h>
 #include <linux/iommu.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+/**
+ * DOC: io_mm model
+ *
+ * The io_mm keeps track of process address spaces shared between CPU and IOMMU.
+ * The following example illustrates the relation between structures
+ * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and
+ * device. A device can have multiple io_mm and an io_mm may be bound to
+ * multiple devices.
+ *              ___________________________
+ *             |  IOMMU domain A           |
+ *             |  ________________         |
+ *             | |  IOMMU group   |        +------- io_pgtables
+ *             | |                |        |
+ *             | |   dev 00:00.0 ----+------- bond --- io_mm X
+ *             | |________________|   \    |
+ *             |                       '----- bond ---.
+ *             |___________________________|           \
+ *              ___________________________             \
+ *             |  IOMMU domain B           |           io_mm Y
+ *             |  ________________         |           / /
+ *             | |  IOMMU group   |        |          / /
+ *             | |                |        |         / /
+ *             | |   dev 00:01.0 ------------ bond -' /
+ *             | |   dev 00:01.1 ------------ bond --'
+ *             | |________________|        |
+ *             |                           +------- io_pgtables
+ *             |___________________________|
+ *
+ * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain
+ * B. All devices within the same domain access the same address spaces. Device
+ * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct.
+ * Devices 00:01.* only access address space Y. In addition each
+ * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is
+ * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU.
+ *
+ * To obtain the above configuration, users would for instance issue the
+ * following calls:
+ *
+ *     iommu_sva_bind_device(dev 00:00.0, mm X, ...) -> PASID 1
+ *     iommu_sva_bind_device(dev 00:00.0, mm Y, ...) -> PASID 2
+ *     iommu_sva_bind_device(dev 00:01.0, mm Y, ...) -> PASID 2
+ *     iommu_sva_bind_device(dev 00:01.1, mm Y, ...) -> PASID 2
+ *
+ * A single Process Address Space ID (PASID) is allocated for each mm. In the
+ * example, devices use PASID 1 to read/write into address space X and PASID 2
+ * to read/write into address space Y.
+ *
+ * Hardware tables describing this configuration in the IOMMU would typically
+ * look like this:
+ *
+ *                                PASID tables
+ *                                 of domain A
+ *                              .->+--------+
+ *                             / 0 |        |-------> io_pgtable
+ *                            /    +--------+
+ *            Device tables  /   1 |        |-------> pgd X
+ *              +--------+  /      +--------+
+ *      00:00.0 |      A |-'     2 |        |--.
+ *              +--------+         +--------+   \
+ *              :        :       3 |        |    \
+ *              +--------+         +--------+     --> pgd Y
+ *      00:01.0 |      B |--.                    /
+ *              +--------+   \                  |
+ *      00:01.1 |      B |----+   PASID tables  |
+ *              +--------+     \   of domain B  |
+ *                              '->+--------+   |
+ *                               0 |        |-- | --> io_pgtable
+ *                                 +--------+   |
+ *                               1 |        |   |
+ *                                 +--------+   |
+ *                               2 |        |---'
+ *                                 +--------+
+ *                               3 |        |
+ *                                 +--------+
+ *
+ * With this model, a single call binds all devices in a given domain to an
+ * address space. Other devices in the domain will get the same bond implicitly.
+ * However, users must issue one bind() for each device, because IOMMUs may
+ * implement SVA differently. Furthermore, mandating one bind() per device
+ * allows the driver to perform sanity-checks on device capabilities.
+ *
+ * On Arm and AMD IOMMUs, entry 0 of the PASID table can be used to hold
+ * non-PASID translations. In this case PASID 0 is reserved and entry 0 points
+ * to the io_pgtable base. On Intel IOMMU, the io_pgtable base would be held in
+ * the device table and PASID 0 would be available to the allocator.
+ */
 
 /* TODO: stub for the fault queue. Remove later. */
 #define iommu_fault_queue_flush(...)
 
+struct iommu_bond {
+	struct io_mm		*io_mm;
+	struct device		*dev;
+	struct iommu_domain	*domain;
+
+	struct list_head	mm_head;
+	struct list_head	dev_head;
+	struct list_head	domain_head;
+
+	void			*drvdata;
+
+	/* Number of bind() calls */
+	refcount_t		refs;
+};
+
+/*
+ * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is
+ * used for returning errors). In practice implementations will use at most 20
+ * bits, which is the PCI limit.
+ */
+static DEFINE_IDR(iommu_pasid_idr);
+
+/*
+ * For the moment this is an all-purpose lock. It serializes
+ * access/modifications to bonds, access/modifications to the PASID IDR, and
+ * changes to io_mm refcount as well.
+ */
+static DEFINE_SPINLOCK(iommu_sva_lock);
+
+static struct io_mm *
+io_mm_alloc(struct iommu_domain *domain, struct device *dev,
+	    struct mm_struct *mm)
+{
+	int ret;
+	int pasid;
+	struct io_mm *io_mm;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param || !domain->ops->mm_alloc || !domain->ops->mm_free)
+		return ERR_PTR(-ENODEV);
+
+	io_mm = domain->ops->mm_alloc(domain, mm);
+	if (IS_ERR(io_mm))
+		return io_mm;
+	if (!io_mm)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * The mm must not be freed until after the driver frees the io_mm
+	 * (which may involve unpinning the CPU ASID for instance, requiring a
+	 * valid mm struct.)
+	 */
+	mmgrab(mm);
+
+	io_mm->mm		= mm;
+	io_mm->release		= domain->ops->mm_free;
+	INIT_LIST_HEAD(&io_mm->devices);
+
+	idr_preload(GFP_KERNEL);
+	spin_lock(&iommu_sva_lock);
+	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
+				 dev_param->max_pasid + 1, GFP_ATOMIC);
+	io_mm->pasid = pasid;
+	spin_unlock(&iommu_sva_lock);
+	idr_preload_end();
+
+	if (pasid < 0) {
+		ret = pasid;
+		goto err_free_mm;
+	}
+
+	/* TODO: keep track of mm. For the moment, abort. */
+	ret = -ENOSYS;
+	spin_lock(&iommu_sva_lock);
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	spin_unlock(&iommu_sva_lock);
+
+err_free_mm:
+	domain->ops->mm_free(io_mm);
+	mmdrop(mm);
+
+	return ERR_PTR(ret);
+}
+
+static void io_mm_free(struct io_mm *io_mm)
+{
+	struct mm_struct *mm;
+	void (*release)(struct io_mm *);
+
+	release = io_mm->release;
+	mm = io_mm->mm;
+
+	release(io_mm);
+	mmdrop(mm);
+}
+
+static void io_mm_release(struct kref *kref)
+{
+	struct io_mm *io_mm;
+
+	io_mm = container_of(kref, struct io_mm, kref);
+	WARN_ON(!list_empty(&io_mm->devices));
+
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+
+	io_mm_free(io_mm);
+}
+
+/*
+ * Returns non-zero if a reference to the io_mm was successfully taken.
+ * Returns zero if the io_mm is being freed and should not be used.
+ */
+static int io_mm_get_locked(struct io_mm *io_mm)
+{
+	if (io_mm)
+		return kref_get_unless_zero(&io_mm->kref);
+
+	return 0;
+}
+
+static void io_mm_put_locked(struct io_mm *io_mm)
+{
+	kref_put(&io_mm->kref, io_mm_release);
+}
+
+static void io_mm_put(struct io_mm *io_mm)
+{
+	spin_lock(&iommu_sva_lock);
+	io_mm_put_locked(io_mm);
+	spin_unlock(&iommu_sva_lock);
+}
+
+static int io_mm_attach(struct iommu_domain *domain, struct device *dev,
+			struct io_mm *io_mm, void *drvdata)
+{
+	int ret;
+	bool attach_domain = true;
+	int pasid = io_mm->pasid;
+	struct iommu_bond *bond, *tmp;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param)
+		return -EINVAL;
+
+	if (!domain->ops->mm_attach || !domain->ops->mm_detach)
+		return -ENODEV;
+
+	if (pasid > dev_param->max_pasid || pasid < dev_param->min_pasid)
+		return -ERANGE;
+
+	bond = kzalloc(sizeof(*bond), GFP_KERNEL);
+	if (!bond)
+		return -ENOMEM;
+
+	bond->domain		= domain;
+	bond->io_mm		= io_mm;
+	bond->dev		= dev;
+	bond->drvdata		= drvdata;
+	refcount_set(&bond->refs, 1);
+
+	spin_lock(&iommu_sva_lock);
+	/*
+	 * Check if this io_mm is already bound to the domain. In which case the
+	 * IOMMU driver doesn't have to install the PASID table entry.
+	 */
+	list_for_each_entry(tmp, &domain->mm_list, domain_head) {
+		if (tmp->io_mm == io_mm) {
+			attach_domain = false;
+			break;
+		}
+	}
+
+	ret = domain->ops->mm_attach(domain, dev, io_mm, attach_domain);
+	if (ret) {
+		kfree(bond);
+		spin_unlock(&iommu_sva_lock);
+		return ret;
+	}
+
+	list_add(&bond->mm_head, &io_mm->devices);
+	list_add(&bond->domain_head, &domain->mm_list);
+	list_add(&bond->dev_head, &dev_param->mm_list);
+	spin_unlock(&iommu_sva_lock);
+
+	return 0;
+}
+
+static bool io_mm_detach_locked(struct iommu_bond *bond)
+{
+	struct iommu_bond *tmp;
+	bool detach_domain = true;
+	struct iommu_domain *domain = bond->domain;
+
+	if (!refcount_dec_and_test(&bond->refs))
+		return false;
+
+	list_for_each_entry(tmp, &domain->mm_list, domain_head) {
+		if (tmp->io_mm == bond->io_mm && tmp->dev != bond->dev) {
+			detach_domain = false;
+			break;
+		}
+	}
+
+	domain->ops->mm_detach(domain, bond->dev, bond->io_mm, detach_domain);
+
+	list_del(&bond->mm_head);
+	list_del(&bond->domain_head);
+	list_del(&bond->dev_head);
+	io_mm_put_locked(bond->io_mm);
+
+	kfree(bond);
+
+	return true;
+}
+
+static void io_mm_detach_all_locked(struct iommu_bond *bond)
+{
+	while (!io_mm_detach_locked(bond));
+}
+
 /**
  * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
  * @dev: the device
@@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
 int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
 			  unsigned long flags, void *drvdata)
 {
+	int i, ret;
+	struct io_mm *io_mm = NULL;
 	struct iommu_domain *domain;
+	struct iommu_bond *bond = NULL, *tmp;
 	struct iommu_param *dev_param = dev->iommu_param;
 
 	domain = iommu_get_domain_for_dev(dev);
@@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
 	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
 		return -EINVAL;
 
-	return -ENOSYS; /* TODO */
+	/* If an io_mm already exists, use it */
+	spin_lock(&iommu_sva_lock);
+	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
+		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
+			continue;
+
+		/* Is it already bound to this device? */
+		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
+			if (tmp->dev != dev)
+				continue;
+
+			bond = tmp;
+			refcount_inc(&bond->refs);
+			io_mm_put_locked(io_mm);
+			break;
+		}
+		break;
+	}
+	spin_unlock(&iommu_sva_lock);
+
+	if (bond)
+		return 0;
+
+	if (!io_mm) {
+		io_mm = io_mm_alloc(domain, dev, mm);
+		if (IS_ERR(io_mm))
+			return PTR_ERR(io_mm);
+	}
+
+	ret = io_mm_attach(domain, dev, io_mm, drvdata);
+	if (ret)
+		io_mm_put(io_mm);
+	else
+		*pasid = io_mm->pasid;
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
 
@@ -165,7 +513,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
  */
 int iommu_sva_unbind_device(struct device *dev, int pasid)
 {
+	int ret = -ESRCH;
+	struct io_mm *io_mm;
 	struct iommu_domain *domain;
+	struct iommu_bond *bond = NULL;
 
 	domain = iommu_get_domain_for_dev(dev);
 	if (WARN_ON(!domain))
@@ -177,7 +528,23 @@ int iommu_sva_unbind_device(struct device *dev, int pasid)
 	 */
 	iommu_fault_queue_flush(dev);
 
-	return -ENOSYS; /* TODO */
+	spin_lock(&iommu_sva_lock);
+	io_mm = idr_find(&iommu_pasid_idr, pasid);
+	if (!io_mm) {
+		spin_unlock(&iommu_sva_lock);
+		return -ESRCH;
+	}
+
+	list_for_each_entry(bond, &io_mm->devices, mm_head) {
+		if (bond->dev == dev) {
+			io_mm_detach_locked(bond);
+			ret = 0;
+			break;
+		}
+	}
+	spin_unlock(&iommu_sva_lock);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
 
@@ -188,8 +555,17 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
  */
 void __iommu_sva_unbind_dev_all(struct device *dev)
 {
+	struct iommu_bond *bond, *next;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param)
+		return;
+
 	iommu_fault_queue_flush(dev);
 
-	/* TODO */
+	spin_lock(&iommu_sva_lock);
+	list_for_each_entry_safe(bond, next, &dev_param->mm_list, dev_head)
+		io_mm_detach_all_locked(bond);
+	spin_unlock(&iommu_sva_lock);
 }
 EXPORT_SYMBOL_GPL(__iommu_sva_unbind_dev_all);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f977851c522b..1d60b32a6744 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -586,6 +586,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 		ret = -ENOMEM;
 		goto err_free_name;
 	}
+	INIT_LIST_HEAD(&dev->iommu_param->mm_list);
 
 	kobject_get(group->devices_kobj);
 
@@ -1325,6 +1326,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 	domain->type = type;
 	/* Assume all sizes by default; the driver may override this later */
 	domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
+	INIT_LIST_HEAD(&domain->mm_list);
 
 	return domain;
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 1fb10d64b9e5..09d85f44142a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -103,6 +103,18 @@ struct iommu_domain {
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	void *iova_cookie;
+
+	struct list_head mm_list;
+};
+
+struct io_mm {
+	int			pasid;
+	struct list_head	devices;
+	struct kref		kref;
+	struct mm_struct	*mm;
+
+	/* Release callback for this mm */
+	void (*release)(struct io_mm *io_mm);
 };
 
 enum iommu_cap {
@@ -204,6 +216,11 @@ struct page_response_msg {
  * @detach_dev: detach device from an iommu domain
  * @sva_device_init: initialize Shared Virtual Adressing for a device
  * @sva_device_shutdown: shutdown Shared Virtual Adressing for a device
+ * @mm_alloc: allocate io_mm
+ * @mm_free: free io_mm
+ * @mm_attach: attach io_mm to a device. Install PASID entry if necessary
+ * @mm_detach: detach io_mm from a device. Remove PASID entry and
+ *             flush associated TLB entries.
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @map_sg: map a scatter-gather list of physically contiguous memory chunks
@@ -241,6 +258,13 @@ struct iommu_ops {
 			       unsigned int *min_pasid,
 			       unsigned int *max_pasid);
 	void (*sva_device_shutdown)(struct device *dev);
+	struct io_mm *(*mm_alloc)(struct iommu_domain *domain,
+				  struct mm_struct *mm);
+	void (*mm_free)(struct io_mm *io_mm);
+	int (*mm_attach)(struct iommu_domain *domain, struct device *dev,
+			 struct io_mm *io_mm, bool attach_domain);
+	void (*mm_detach)(struct iommu_domain *domain, struct device *dev,
+			  struct io_mm *io_mm, bool detach_domain);
 	int (*map)(struct iommu_domain *domain, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
@@ -399,6 +423,7 @@ struct iommu_param {
 	unsigned long sva_features;
 	unsigned int min_pasid;
 	unsigned int max_pasid;
+	struct list_head mm_list;
 };
 
 int  iommu_device_register(struct iommu_device *iommu);
-- 
2.15.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-02-12 18:33     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-02-12 18:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-pci, linux-acpi, devicetree, iommu, kvm
  Cc: joro, robh+dt, mark.rutland, catalin.marinas, will.deacon,
	lorenzo.pieralisi, hanjun.guo, sudeep.holla, rjw, lenb,
	robin.murphy, bhelgaas, alex.williamson, tn, liubo95,
	thunder.leizhen, xieyisheng1, xuzaibo, ilias.apalodimas,
	jonathan.cameron, shunyong.yang, nwatters, okaya, jcrouse,
	rfranz, dwmw2, jacob.jun.pan, yi.l.liu, ashok.raj, robdclark,
	christian.koenig, bharatku

Introduce boilerplate code for allocating IOMMU mm structures and binding
them to devices. Four operations are added to IOMMU drivers:

* mm_alloc(): to create an io_mm structure and perform architecture-
  specific operations required to grab the process (for instance on ARM,
  pin down the CPU ASID so that the process doesn't get assigned a new
  ASID on rollover).

  There is a single valid io_mm structure per Linux mm. Future extensions
  may also use io_mm for kernel-managed address spaces, populated with
  map()/unmap() calls instead of bound to process address spaces. This
  patch focuses on "shared" io_mm.

* mm_attach(): attach an mm to a device. The IOMMU driver checks that the
  device is capable of sharing an address space, and writes the PASID
  table entry to install the pgd.

  Some IOMMU drivers will have a single PASID table per domain, for
  convenience. Other can implement it differently but to help these
  drivers, mm_attach and mm_detach take 'attach_domain' and
  'detach_domain' parameters, that tell whether they need to set and clear
  the PASID entry or only send the required TLB invalidations.

* mm_detach(): detach an mm from a device. The IOMMU driver removes the
  PASID table entry and invalidates the IOTLBs.

* mm_free(): free a structure allocated by mm_alloc(), and let arch
  release the process.

mm_attach and mm_detach operations are serialized with a spinlock. At the
moment it is global, but if we try to optimize it, the core should at
least prevent concurrent attach()/detach() on the same domain (so
multi-level PASID table code can allocate tables lazily). mm_alloc() can
sleep, but mm_free must not (because we'll have to call it from call_srcu
later on.)

At the moment we use an IDR for allocating PASIDs and retrieving contexts.
We also use a single spinlock. These can be refined and optimized later (a
custom allocator will be needed for top-down PASID allocation).

Keeping track of address spaces requires the use of MMU notifiers.
Handling process exit with regard to unbind() is tricky, so it is left for
another patch and we explicitly fail mm_alloc() for the moment.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/iommu/iommu-sva.c | 382 +++++++++++++++++++++++++++++++++++++++++++++-
 drivers/iommu/iommu.c     |   2 +
 include/linux/iommu.h     |  25 +++
 3 files changed, 406 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 593685d891bf..f9af9d66b3ed 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -7,11 +7,321 @@
  * SPDX-License-Identifier: GPL-2.0
  */
 
+#include <linux/idr.h>
 #include <linux/iommu.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+/**
+ * DOC: io_mm model
+ *
+ * The io_mm keeps track of process address spaces shared between CPU and IOMMU.
+ * The following example illustrates the relation between structures
+ * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and
+ * device. A device can have multiple io_mm and an io_mm may be bound to
+ * multiple devices.
+ *              ___________________________
+ *             |  IOMMU domain A           |
+ *             |  ________________         |
+ *             | |  IOMMU group   |        +------- io_pgtables
+ *             | |                |        |
+ *             | |   dev 00:00.0 ----+------- bond --- io_mm X
+ *             | |________________|   \    |
+ *             |                       '----- bond ---.
+ *             |___________________________|           \
+ *              ___________________________             \
+ *             |  IOMMU domain B           |           io_mm Y
+ *             |  ________________         |           / /
+ *             | |  IOMMU group   |        |          / /
+ *             | |                |        |         / /
+ *             | |   dev 00:01.0 ------------ bond -' /
+ *             | |   dev 00:01.1 ------------ bond --'
+ *             | |________________|        |
+ *             |                           +------- io_pgtables
+ *             |___________________________|
+ *
+ * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain
+ * B. All devices within the same domain access the same address spaces. Device
+ * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct.
+ * Devices 00:01.* only access address space Y. In addition each
+ * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is
+ * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU.
+ *
+ * To obtain the above configuration, users would for instance issue the
+ * following calls:
+ *
+ *     iommu_sva_bind_device(dev 00:00.0, mm X, ...) -> PASID 1
+ *     iommu_sva_bind_device(dev 00:00.0, mm Y, ...) -> PASID 2
+ *     iommu_sva_bind_device(dev 00:01.0, mm Y, ...) -> PASID 2
+ *     iommu_sva_bind_device(dev 00:01.1, mm Y, ...) -> PASID 2
+ *
+ * A single Process Address Space ID (PASID) is allocated for each mm. In the
+ * example, devices use PASID 1 to read/write into address space X and PASID 2
+ * to read/write into address space Y.
+ *
+ * Hardware tables describing this configuration in the IOMMU would typically
+ * look like this:
+ *
+ *                                PASID tables
+ *                                 of domain A
+ *                              .->+--------+
+ *                             / 0 |        |-------> io_pgtable
+ *                            /    +--------+
+ *            Device tables  /   1 |        |-------> pgd X
+ *              +--------+  /      +--------+
+ *      00:00.0 |      A |-'     2 |        |--.
+ *              +--------+         +--------+   \
+ *              :        :       3 |        |    \
+ *              +--------+         +--------+     --> pgd Y
+ *      00:01.0 |      B |--.                    /
+ *              +--------+   \                  |
+ *      00:01.1 |      B |----+   PASID tables  |
+ *              +--------+     \   of domain B  |
+ *                              '->+--------+   |
+ *                               0 |        |-- | --> io_pgtable
+ *                                 +--------+   |
+ *                               1 |        |   |
+ *                                 +--------+   |
+ *                               2 |        |---'
+ *                                 +--------+
+ *                               3 |        |
+ *                                 +--------+
+ *
+ * With this model, a single call binds all devices in a given domain to an
+ * address space. Other devices in the domain will get the same bond implicitly.
+ * However, users must issue one bind() for each device, because IOMMUs may
+ * implement SVA differently. Furthermore, mandating one bind() per device
+ * allows the driver to perform sanity-checks on device capabilities.
+ *
+ * On Arm and AMD IOMMUs, entry 0 of the PASID table can be used to hold
+ * non-PASID translations. In this case PASID 0 is reserved and entry 0 points
+ * to the io_pgtable base. On Intel IOMMU, the io_pgtable base would be held in
+ * the device table and PASID 0 would be available to the allocator.
+ */
 
 /* TODO: stub for the fault queue. Remove later. */
 #define iommu_fault_queue_flush(...)
 
+struct iommu_bond {
+	struct io_mm		*io_mm;
+	struct device		*dev;
+	struct iommu_domain	*domain;
+
+	struct list_head	mm_head;
+	struct list_head	dev_head;
+	struct list_head	domain_head;
+
+	void			*drvdata;
+
+	/* Number of bind() calls */
+	refcount_t		refs;
+};
+
+/*
+ * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is
+ * used for returning errors). In practice implementations will use at most 20
+ * bits, which is the PCI limit.
+ */
+static DEFINE_IDR(iommu_pasid_idr);
+
+/*
+ * For the moment this is an all-purpose lock. It serializes
+ * access/modifications to bonds, access/modifications to the PASID IDR, and
+ * changes to io_mm refcount as well.
+ */
+static DEFINE_SPINLOCK(iommu_sva_lock);
+
+static struct io_mm *
+io_mm_alloc(struct iommu_domain *domain, struct device *dev,
+	    struct mm_struct *mm)
+{
+	int ret;
+	int pasid;
+	struct io_mm *io_mm;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param || !domain->ops->mm_alloc || !domain->ops->mm_free)
+		return ERR_PTR(-ENODEV);
+
+	io_mm = domain->ops->mm_alloc(domain, mm);
+	if (IS_ERR(io_mm))
+		return io_mm;
+	if (!io_mm)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * The mm must not be freed until after the driver frees the io_mm
+	 * (which may involve unpinning the CPU ASID for instance, requiring a
+	 * valid mm struct.)
+	 */
+	mmgrab(mm);
+
+	io_mm->mm		= mm;
+	io_mm->release		= domain->ops->mm_free;
+	INIT_LIST_HEAD(&io_mm->devices);
+
+	idr_preload(GFP_KERNEL);
+	spin_lock(&iommu_sva_lock);
+	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
+				 dev_param->max_pasid + 1, GFP_ATOMIC);
+	io_mm->pasid = pasid;
+	spin_unlock(&iommu_sva_lock);
+	idr_preload_end();
+
+	if (pasid < 0) {
+		ret = pasid;
+		goto err_free_mm;
+	}
+
+	/* TODO: keep track of mm. For the moment, abort. */
+	ret = -ENOSYS;
+	spin_lock(&iommu_sva_lock);
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	spin_unlock(&iommu_sva_lock);
+
+err_free_mm:
+	domain->ops->mm_free(io_mm);
+	mmdrop(mm);
+
+	return ERR_PTR(ret);
+}
+
+static void io_mm_free(struct io_mm *io_mm)
+{
+	struct mm_struct *mm;
+	void (*release)(struct io_mm *);
+
+	release = io_mm->release;
+	mm = io_mm->mm;
+
+	release(io_mm);
+	mmdrop(mm);
+}
+
+static void io_mm_release(struct kref *kref)
+{
+	struct io_mm *io_mm;
+
+	io_mm = container_of(kref, struct io_mm, kref);
+	WARN_ON(!list_empty(&io_mm->devices));
+
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+
+	io_mm_free(io_mm);
+}
+
+/*
+ * Returns non-zero if a reference to the io_mm was successfully taken.
+ * Returns zero if the io_mm is being freed and should not be used.
+ */
+static int io_mm_get_locked(struct io_mm *io_mm)
+{
+	if (io_mm)
+		return kref_get_unless_zero(&io_mm->kref);
+
+	return 0;
+}
+
+static void io_mm_put_locked(struct io_mm *io_mm)
+{
+	kref_put(&io_mm->kref, io_mm_release);
+}
+
+static void io_mm_put(struct io_mm *io_mm)
+{
+	spin_lock(&iommu_sva_lock);
+	io_mm_put_locked(io_mm);
+	spin_unlock(&iommu_sva_lock);
+}
+
+static int io_mm_attach(struct iommu_domain *domain, struct device *dev,
+			struct io_mm *io_mm, void *drvdata)
+{
+	int ret;
+	bool attach_domain = true;
+	int pasid = io_mm->pasid;
+	struct iommu_bond *bond, *tmp;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param)
+		return -EINVAL;
+
+	if (!domain->ops->mm_attach || !domain->ops->mm_detach)
+		return -ENODEV;
+
+	if (pasid > dev_param->max_pasid || pasid < dev_param->min_pasid)
+		return -ERANGE;
+
+	bond = kzalloc(sizeof(*bond), GFP_KERNEL);
+	if (!bond)
+		return -ENOMEM;
+
+	bond->domain		= domain;
+	bond->io_mm		= io_mm;
+	bond->dev		= dev;
+	bond->drvdata		= drvdata;
+	refcount_set(&bond->refs, 1);
+
+	spin_lock(&iommu_sva_lock);
+	/*
+	 * Check if this io_mm is already bound to the domain. In which case the
+	 * IOMMU driver doesn't have to install the PASID table entry.
+	 */
+	list_for_each_entry(tmp, &domain->mm_list, domain_head) {
+		if (tmp->io_mm == io_mm) {
+			attach_domain = false;
+			break;
+		}
+	}
+
+	ret = domain->ops->mm_attach(domain, dev, io_mm, attach_domain);
+	if (ret) {
+		kfree(bond);
+		spin_unlock(&iommu_sva_lock);
+		return ret;
+	}
+
+	list_add(&bond->mm_head, &io_mm->devices);
+	list_add(&bond->domain_head, &domain->mm_list);
+	list_add(&bond->dev_head, &dev_param->mm_list);
+	spin_unlock(&iommu_sva_lock);
+
+	return 0;
+}
+
+static bool io_mm_detach_locked(struct iommu_bond *bond)
+{
+	struct iommu_bond *tmp;
+	bool detach_domain = true;
+	struct iommu_domain *domain = bond->domain;
+
+	if (!refcount_dec_and_test(&bond->refs))
+		return false;
+
+	list_for_each_entry(tmp, &domain->mm_list, domain_head) {
+		if (tmp->io_mm == bond->io_mm && tmp->dev != bond->dev) {
+			detach_domain = false;
+			break;
+		}
+	}
+
+	domain->ops->mm_detach(domain, bond->dev, bond->io_mm, detach_domain);
+
+	list_del(&bond->mm_head);
+	list_del(&bond->domain_head);
+	list_del(&bond->dev_head);
+	io_mm_put_locked(bond->io_mm);
+
+	kfree(bond);
+
+	return true;
+}
+
+static void io_mm_detach_all_locked(struct iommu_bond *bond)
+{
+	while (!io_mm_detach_locked(bond));
+}
+
 /**
  * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
  * @dev: the device
@@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
 int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
 			  unsigned long flags, void *drvdata)
 {
+	int i, ret;
+	struct io_mm *io_mm = NULL;
 	struct iommu_domain *domain;
+	struct iommu_bond *bond = NULL, *tmp;
 	struct iommu_param *dev_param = dev->iommu_param;
 
 	domain = iommu_get_domain_for_dev(dev);
@@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
 	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
 		return -EINVAL;
 
-	return -ENOSYS; /* TODO */
+	/* If an io_mm already exists, use it */
+	spin_lock(&iommu_sva_lock);
+	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
+		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
+			continue;
+
+		/* Is it already bound to this device? */
+		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
+			if (tmp->dev != dev)
+				continue;
+
+			bond = tmp;
+			refcount_inc(&bond->refs);
+			io_mm_put_locked(io_mm);
+			break;
+		}
+		break;
+	}
+	spin_unlock(&iommu_sva_lock);
+
+	if (bond)
+		return 0;
+
+	if (!io_mm) {
+		io_mm = io_mm_alloc(domain, dev, mm);
+		if (IS_ERR(io_mm))
+			return PTR_ERR(io_mm);
+	}
+
+	ret = io_mm_attach(domain, dev, io_mm, drvdata);
+	if (ret)
+		io_mm_put(io_mm);
+	else
+		*pasid = io_mm->pasid;
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
 
@@ -165,7 +513,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
  */
 int iommu_sva_unbind_device(struct device *dev, int pasid)
 {
+	int ret = -ESRCH;
+	struct io_mm *io_mm;
 	struct iommu_domain *domain;
+	struct iommu_bond *bond = NULL;
 
 	domain = iommu_get_domain_for_dev(dev);
 	if (WARN_ON(!domain))
@@ -177,7 +528,23 @@ int iommu_sva_unbind_device(struct device *dev, int pasid)
 	 */
 	iommu_fault_queue_flush(dev);
 
-	return -ENOSYS; /* TODO */
+	spin_lock(&iommu_sva_lock);
+	io_mm = idr_find(&iommu_pasid_idr, pasid);
+	if (!io_mm) {
+		spin_unlock(&iommu_sva_lock);
+		return -ESRCH;
+	}
+
+	list_for_each_entry(bond, &io_mm->devices, mm_head) {
+		if (bond->dev == dev) {
+			io_mm_detach_locked(bond);
+			ret = 0;
+			break;
+		}
+	}
+	spin_unlock(&iommu_sva_lock);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
 
@@ -188,8 +555,17 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
  */
 void __iommu_sva_unbind_dev_all(struct device *dev)
 {
+	struct iommu_bond *bond, *next;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param)
+		return;
+
 	iommu_fault_queue_flush(dev);
 
-	/* TODO */
+	spin_lock(&iommu_sva_lock);
+	list_for_each_entry_safe(bond, next, &dev_param->mm_list, dev_head)
+		io_mm_detach_all_locked(bond);
+	spin_unlock(&iommu_sva_lock);
 }
 EXPORT_SYMBOL_GPL(__iommu_sva_unbind_dev_all);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f977851c522b..1d60b32a6744 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -586,6 +586,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 		ret = -ENOMEM;
 		goto err_free_name;
 	}
+	INIT_LIST_HEAD(&dev->iommu_param->mm_list);
 
 	kobject_get(group->devices_kobj);
 
@@ -1325,6 +1326,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 	domain->type = type;
 	/* Assume all sizes by default; the driver may override this later */
 	domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
+	INIT_LIST_HEAD(&domain->mm_list);
 
 	return domain;
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 1fb10d64b9e5..09d85f44142a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -103,6 +103,18 @@ struct iommu_domain {
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	void *iova_cookie;
+
+	struct list_head mm_list;
+};
+
+struct io_mm {
+	int			pasid;
+	struct list_head	devices;
+	struct kref		kref;
+	struct mm_struct	*mm;
+
+	/* Release callback for this mm */
+	void (*release)(struct io_mm *io_mm);
 };
 
 enum iommu_cap {
@@ -204,6 +216,11 @@ struct page_response_msg {
  * @detach_dev: detach device from an iommu domain
  * @sva_device_init: initialize Shared Virtual Adressing for a device
  * @sva_device_shutdown: shutdown Shared Virtual Adressing for a device
+ * @mm_alloc: allocate io_mm
+ * @mm_free: free io_mm
+ * @mm_attach: attach io_mm to a device. Install PASID entry if necessary
+ * @mm_detach: detach io_mm from a device. Remove PASID entry and
+ *             flush associated TLB entries.
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @map_sg: map a scatter-gather list of physically contiguous memory chunks
@@ -241,6 +258,13 @@ struct iommu_ops {
 			       unsigned int *min_pasid,
 			       unsigned int *max_pasid);
 	void (*sva_device_shutdown)(struct device *dev);
+	struct io_mm *(*mm_alloc)(struct iommu_domain *domain,
+				  struct mm_struct *mm);
+	void (*mm_free)(struct io_mm *io_mm);
+	int (*mm_attach)(struct iommu_domain *domain, struct device *dev,
+			 struct io_mm *io_mm, bool attach_domain);
+	void (*mm_detach)(struct iommu_domain *domain, struct device *dev,
+			  struct io_mm *io_mm, bool detach_domain);
 	int (*map)(struct iommu_domain *domain, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
@@ -399,6 +423,7 @@ struct iommu_param {
 	unsigned long sva_features;
 	unsigned int min_pasid;
 	unsigned int max_pasid;
+	struct list_head mm_list;
 };
 
 int  iommu_device_register(struct iommu_device *iommu);
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 03/37] iommu/sva: Manage process address spaces
@ 2018-02-12 18:33     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 62+ messages in thread
From: Jean-Philippe Brucker @ 2018-02-12 18:33 UTC (permalink / raw)
  To: linux-arm-kernel

Introduce boilerplate code for allocating IOMMU mm structures and binding
them to devices. Four operations are added to IOMMU drivers:

* mm_alloc(): to create an io_mm structure and perform architecture-
  specific operations required to grab the process (for instance on ARM,
  pin down the CPU ASID so that the process doesn't get assigned a new
  ASID on rollover).

  There is a single valid io_mm structure per Linux mm. Future extensions
  may also use io_mm for kernel-managed address spaces, populated with
  map()/unmap() calls instead of bound to process address spaces. This
  patch focuses on "shared" io_mm.

* mm_attach(): attach an mm to a device. The IOMMU driver checks that the
  device is capable of sharing an address space, and writes the PASID
  table entry to install the pgd.

  Some IOMMU drivers will have a single PASID table per domain, for
  convenience. Other can implement it differently but to help these
  drivers, mm_attach and mm_detach take 'attach_domain' and
  'detach_domain' parameters, that tell whether they need to set and clear
  the PASID entry or only send the required TLB invalidations.

* mm_detach(): detach an mm from a device. The IOMMU driver removes the
  PASID table entry and invalidates the IOTLBs.

* mm_free(): free a structure allocated by mm_alloc(), and let arch
  release the process.

mm_attach and mm_detach operations are serialized with a spinlock. At the
moment it is global, but if we try to optimize it, the core should at
least prevent concurrent attach()/detach() on the same domain (so
multi-level PASID table code can allocate tables lazily). mm_alloc() can
sleep, but mm_free must not (because we'll have to call it from call_srcu
later on.)

At the moment we use an IDR for allocating PASIDs and retrieving contexts.
We also use a single spinlock. These can be refined and optimized later (a
custom allocator will be needed for top-down PASID allocation).

Keeping track of address spaces requires the use of MMU notifiers.
Handling process exit with regard to unbind() is tricky, so it is left for
another patch and we explicitly fail mm_alloc() for the moment.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/iommu/iommu-sva.c | 382 +++++++++++++++++++++++++++++++++++++++++++++-
 drivers/iommu/iommu.c     |   2 +
 include/linux/iommu.h     |  25 +++
 3 files changed, 406 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 593685d891bf..f9af9d66b3ed 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -7,11 +7,321 @@
  * SPDX-License-Identifier: GPL-2.0
  */
 
+#include <linux/idr.h>
 #include <linux/iommu.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+/**
+ * DOC: io_mm model
+ *
+ * The io_mm keeps track of process address spaces shared between CPU and IOMMU.
+ * The following example illustrates the relation between structures
+ * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and
+ * device. A device can have multiple io_mm and an io_mm may be bound to
+ * multiple devices.
+ *              ___________________________
+ *             |  IOMMU domain A           |
+ *             |  ________________         |
+ *             | |  IOMMU group   |        +------- io_pgtables
+ *             | |                |        |
+ *             | |   dev 00:00.0 ----+------- bond --- io_mm X
+ *             | |________________|   \    |
+ *             |                       '----- bond ---.
+ *             |___________________________|           \
+ *              ___________________________             \
+ *             |  IOMMU domain B           |           io_mm Y
+ *             |  ________________         |           / /
+ *             | |  IOMMU group   |        |          / /
+ *             | |                |        |         / /
+ *             | |   dev 00:01.0 ------------ bond -' /
+ *             | |   dev 00:01.1 ------------ bond --'
+ *             | |________________|        |
+ *             |                           +------- io_pgtables
+ *             |___________________________|
+ *
+ * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain
+ * B. All devices within the same domain access the same address spaces. Device
+ * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct.
+ * Devices 00:01.* only access address space Y. In addition each
+ * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is
+ * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU.
+ *
+ * To obtain the above configuration, users would for instance issue the
+ * following calls:
+ *
+ *     iommu_sva_bind_device(dev 00:00.0, mm X, ...) -> PASID 1
+ *     iommu_sva_bind_device(dev 00:00.0, mm Y, ...) -> PASID 2
+ *     iommu_sva_bind_device(dev 00:01.0, mm Y, ...) -> PASID 2
+ *     iommu_sva_bind_device(dev 00:01.1, mm Y, ...) -> PASID 2
+ *
+ * A single Process Address Space ID (PASID) is allocated for each mm. In the
+ * example, devices use PASID 1 to read/write into address space X and PASID 2
+ * to read/write into address space Y.
+ *
+ * Hardware tables describing this configuration in the IOMMU would typically
+ * look like this:
+ *
+ *                                PASID tables
+ *                                 of domain A
+ *                              .->+--------+
+ *                             / 0 |        |-------> io_pgtable
+ *                            /    +--------+
+ *            Device tables  /   1 |        |-------> pgd X
+ *              +--------+  /      +--------+
+ *      00:00.0 |      A |-'     2 |        |--.
+ *              +--------+         +--------+   \
+ *              :        :       3 |        |    \
+ *              +--------+         +--------+     --> pgd Y
+ *      00:01.0 |      B |--.                    /
+ *              +--------+   \                  |
+ *      00:01.1 |      B |----+   PASID tables  |
+ *              +--------+     \   of domain B  |
+ *                              '->+--------+   |
+ *                               0 |        |-- | --> io_pgtable
+ *                                 +--------+   |
+ *                               1 |        |   |
+ *                                 +--------+   |
+ *                               2 |        |---'
+ *                                 +--------+
+ *                               3 |        |
+ *                                 +--------+
+ *
+ * With this model, a single call binds all devices in a given domain to an
+ * address space. Other devices in the domain will get the same bond implicitly.
+ * However, users must issue one bind() for each device, because IOMMUs may
+ * implement SVA differently. Furthermore, mandating one bind() per device
+ * allows the driver to perform sanity-checks on device capabilities.
+ *
+ * On Arm and AMD IOMMUs, entry 0 of the PASID table can be used to hold
+ * non-PASID translations. In this case PASID 0 is reserved and entry 0 points
+ * to the io_pgtable base. On Intel IOMMU, the io_pgtable base would be held in
+ * the device table and PASID 0 would be available to the allocator.
+ */
 
 /* TODO: stub for the fault queue. Remove later. */
 #define iommu_fault_queue_flush(...)
 
+struct iommu_bond {
+	struct io_mm		*io_mm;
+	struct device		*dev;
+	struct iommu_domain	*domain;
+
+	struct list_head	mm_head;
+	struct list_head	dev_head;
+	struct list_head	domain_head;
+
+	void			*drvdata;
+
+	/* Number of bind() calls */
+	refcount_t		refs;
+};
+
+/*
+ * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is
+ * used for returning errors). In practice implementations will use at most 20
+ * bits, which is the PCI limit.
+ */
+static DEFINE_IDR(iommu_pasid_idr);
+
+/*
+ * For the moment this is an all-purpose lock. It serializes
+ * access/modifications to bonds, access/modifications to the PASID IDR, and
+ * changes to io_mm refcount as well.
+ */
+static DEFINE_SPINLOCK(iommu_sva_lock);
+
+static struct io_mm *
+io_mm_alloc(struct iommu_domain *domain, struct device *dev,
+	    struct mm_struct *mm)
+{
+	int ret;
+	int pasid;
+	struct io_mm *io_mm;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param || !domain->ops->mm_alloc || !domain->ops->mm_free)
+		return ERR_PTR(-ENODEV);
+
+	io_mm = domain->ops->mm_alloc(domain, mm);
+	if (IS_ERR(io_mm))
+		return io_mm;
+	if (!io_mm)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * The mm must not be freed until after the driver frees the io_mm
+	 * (which may involve unpinning the CPU ASID for instance, requiring a
+	 * valid mm struct.)
+	 */
+	mmgrab(mm);
+
+	io_mm->mm		= mm;
+	io_mm->release		= domain->ops->mm_free;
+	INIT_LIST_HEAD(&io_mm->devices);
+
+	idr_preload(GFP_KERNEL);
+	spin_lock(&iommu_sva_lock);
+	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
+				 dev_param->max_pasid + 1, GFP_ATOMIC);
+	io_mm->pasid = pasid;
+	spin_unlock(&iommu_sva_lock);
+	idr_preload_end();
+
+	if (pasid < 0) {
+		ret = pasid;
+		goto err_free_mm;
+	}
+
+	/* TODO: keep track of mm. For the moment, abort. */
+	ret = -ENOSYS;
+	spin_lock(&iommu_sva_lock);
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	spin_unlock(&iommu_sva_lock);
+
+err_free_mm:
+	domain->ops->mm_free(io_mm);
+	mmdrop(mm);
+
+	return ERR_PTR(ret);
+}
+
+static void io_mm_free(struct io_mm *io_mm)
+{
+	struct mm_struct *mm;
+	void (*release)(struct io_mm *);
+
+	release = io_mm->release;
+	mm = io_mm->mm;
+
+	release(io_mm);
+	mmdrop(mm);
+}
+
+static void io_mm_release(struct kref *kref)
+{
+	struct io_mm *io_mm;
+
+	io_mm = container_of(kref, struct io_mm, kref);
+	WARN_ON(!list_empty(&io_mm->devices));
+
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+
+	io_mm_free(io_mm);
+}
+
+/*
+ * Returns non-zero if a reference to the io_mm was successfully taken.
+ * Returns zero if the io_mm is being freed and should not be used.
+ */
+static int io_mm_get_locked(struct io_mm *io_mm)
+{
+	if (io_mm)
+		return kref_get_unless_zero(&io_mm->kref);
+
+	return 0;
+}
+
+static void io_mm_put_locked(struct io_mm *io_mm)
+{
+	kref_put(&io_mm->kref, io_mm_release);
+}
+
+static void io_mm_put(struct io_mm *io_mm)
+{
+	spin_lock(&iommu_sva_lock);
+	io_mm_put_locked(io_mm);
+	spin_unlock(&iommu_sva_lock);
+}
+
+static int io_mm_attach(struct iommu_domain *domain, struct device *dev,
+			struct io_mm *io_mm, void *drvdata)
+{
+	int ret;
+	bool attach_domain = true;
+	int pasid = io_mm->pasid;
+	struct iommu_bond *bond, *tmp;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param)
+		return -EINVAL;
+
+	if (!domain->ops->mm_attach || !domain->ops->mm_detach)
+		return -ENODEV;
+
+	if (pasid > dev_param->max_pasid || pasid < dev_param->min_pasid)
+		return -ERANGE;
+
+	bond = kzalloc(sizeof(*bond), GFP_KERNEL);
+	if (!bond)
+		return -ENOMEM;
+
+	bond->domain		= domain;
+	bond->io_mm		= io_mm;
+	bond->dev		= dev;
+	bond->drvdata		= drvdata;
+	refcount_set(&bond->refs, 1);
+
+	spin_lock(&iommu_sva_lock);
+	/*
+	 * Check if this io_mm is already bound to the domain. In which case the
+	 * IOMMU driver doesn't have to install the PASID table entry.
+	 */
+	list_for_each_entry(tmp, &domain->mm_list, domain_head) {
+		if (tmp->io_mm == io_mm) {
+			attach_domain = false;
+			break;
+		}
+	}
+
+	ret = domain->ops->mm_attach(domain, dev, io_mm, attach_domain);
+	if (ret) {
+		kfree(bond);
+		spin_unlock(&iommu_sva_lock);
+		return ret;
+	}
+
+	list_add(&bond->mm_head, &io_mm->devices);
+	list_add(&bond->domain_head, &domain->mm_list);
+	list_add(&bond->dev_head, &dev_param->mm_list);
+	spin_unlock(&iommu_sva_lock);
+
+	return 0;
+}
+
+static bool io_mm_detach_locked(struct iommu_bond *bond)
+{
+	struct iommu_bond *tmp;
+	bool detach_domain = true;
+	struct iommu_domain *domain = bond->domain;
+
+	if (!refcount_dec_and_test(&bond->refs))
+		return false;
+
+	list_for_each_entry(tmp, &domain->mm_list, domain_head) {
+		if (tmp->io_mm == bond->io_mm && tmp->dev != bond->dev) {
+			detach_domain = false;
+			break;
+		}
+	}
+
+	domain->ops->mm_detach(domain, bond->dev, bond->io_mm, detach_domain);
+
+	list_del(&bond->mm_head);
+	list_del(&bond->domain_head);
+	list_del(&bond->dev_head);
+	io_mm_put_locked(bond->io_mm);
+
+	kfree(bond);
+
+	return true;
+}
+
+static void io_mm_detach_all_locked(struct iommu_bond *bond)
+{
+	while (!io_mm_detach_locked(bond));
+}
+
 /**
  * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
  * @dev: the device
@@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
 int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
 			  unsigned long flags, void *drvdata)
 {
+	int i, ret;
+	struct io_mm *io_mm = NULL;
 	struct iommu_domain *domain;
+	struct iommu_bond *bond = NULL, *tmp;
 	struct iommu_param *dev_param = dev->iommu_param;
 
 	domain = iommu_get_domain_for_dev(dev);
@@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
 	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
 		return -EINVAL;
 
-	return -ENOSYS; /* TODO */
+	/* If an io_mm already exists, use it */
+	spin_lock(&iommu_sva_lock);
+	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
+		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
+			continue;
+
+		/* Is it already bound to this device? */
+		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
+			if (tmp->dev != dev)
+				continue;
+
+			bond = tmp;
+			refcount_inc(&bond->refs);
+			io_mm_put_locked(io_mm);
+			break;
+		}
+		break;
+	}
+	spin_unlock(&iommu_sva_lock);
+
+	if (bond)
+		return 0;
+
+	if (!io_mm) {
+		io_mm = io_mm_alloc(domain, dev, mm);
+		if (IS_ERR(io_mm))
+			return PTR_ERR(io_mm);
+	}
+
+	ret = io_mm_attach(domain, dev, io_mm, drvdata);
+	if (ret)
+		io_mm_put(io_mm);
+	else
+		*pasid = io_mm->pasid;
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
 
@@ -165,7 +513,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
  */
 int iommu_sva_unbind_device(struct device *dev, int pasid)
 {
+	int ret = -ESRCH;
+	struct io_mm *io_mm;
 	struct iommu_domain *domain;
+	struct iommu_bond *bond = NULL;
 
 	domain = iommu_get_domain_for_dev(dev);
 	if (WARN_ON(!domain))
@@ -177,7 +528,23 @@ int iommu_sva_unbind_device(struct device *dev, int pasid)
 	 */
 	iommu_fault_queue_flush(dev);
 
-	return -ENOSYS; /* TODO */
+	spin_lock(&iommu_sva_lock);
+	io_mm = idr_find(&iommu_pasid_idr, pasid);
+	if (!io_mm) {
+		spin_unlock(&iommu_sva_lock);
+		return -ESRCH;
+	}
+
+	list_for_each_entry(bond, &io_mm->devices, mm_head) {
+		if (bond->dev == dev) {
+			io_mm_detach_locked(bond);
+			ret = 0;
+			break;
+		}
+	}
+	spin_unlock(&iommu_sva_lock);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
 
@@ -188,8 +555,17 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
  */
 void __iommu_sva_unbind_dev_all(struct device *dev)
 {
+	struct iommu_bond *bond, *next;
+	struct iommu_param *dev_param = dev->iommu_param;
+
+	if (!dev_param)
+		return;
+
 	iommu_fault_queue_flush(dev);
 
-	/* TODO */
+	spin_lock(&iommu_sva_lock);
+	list_for_each_entry_safe(bond, next, &dev_param->mm_list, dev_head)
+		io_mm_detach_all_locked(bond);
+	spin_unlock(&iommu_sva_lock);
 }
 EXPORT_SYMBOL_GPL(__iommu_sva_unbind_dev_all);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f977851c522b..1d60b32a6744 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -586,6 +586,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 		ret = -ENOMEM;
 		goto err_free_name;
 	}
+	INIT_LIST_HEAD(&dev->iommu_param->mm_list);
 
 	kobject_get(group->devices_kobj);
 
@@ -1325,6 +1326,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 	domain->type = type;
 	/* Assume all sizes by default; the driver may override this later */
 	domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
+	INIT_LIST_HEAD(&domain->mm_list);
 
 	return domain;
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 1fb10d64b9e5..09d85f44142a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -103,6 +103,18 @@ struct iommu_domain {
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	void *iova_cookie;
+
+	struct list_head mm_list;
+};
+
+struct io_mm {
+	int			pasid;
+	struct list_head	devices;
+	struct kref		kref;
+	struct mm_struct	*mm;
+
+	/* Release callback for this mm */
+	void (*release)(struct io_mm *io_mm);
 };
 
 enum iommu_cap {
@@ -204,6 +216,11 @@ struct page_response_msg {
  * @detach_dev: detach device from an iommu domain
  * @sva_device_init: initialize Shared Virtual Adressing for a device
  * @sva_device_shutdown: shutdown Shared Virtual Adressing for a device
+ * @mm_alloc: allocate io_mm
+ * @mm_free: free io_mm
+ * @mm_attach: attach io_mm to a device. Install PASID entry if necessary
+ * @mm_detach: detach io_mm from a device. Remove PASID entry and
+ *             flush associated TLB entries.
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @map_sg: map a scatter-gather list of physically contiguous memory chunks
@@ -241,6 +258,13 @@ struct iommu_ops {
 			       unsigned int *min_pasid,
 			       unsigned int *max_pasid);
 	void (*sva_device_shutdown)(struct device *dev);
+	struct io_mm *(*mm_alloc)(struct iommu_domain *domain,
+				  struct mm_struct *mm);
+	void (*mm_free)(struct io_mm *io_mm);
+	int (*mm_attach)(struct iommu_domain *domain, struct device *dev,
+			 struct io_mm *io_mm, bool attach_domain);
+	void (*mm_detach)(struct iommu_domain *domain, struct device *dev,
+			  struct io_mm *io_mm, bool detach_domain);
 	int (*map)(struct iommu_domain *domain, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
@@ -399,6 +423,7 @@ struct iommu_param {
 	unsigned long sva_features;
 	unsigned int min_pasid;
 	unsigned int max_pasid;
+	struct list_head mm_list;
 };
 
 int  iommu_device_register(struct iommu_device *iommu);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2018-04-24 18:52 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1519280641-30258-1-git-send-email-xieyisheng1@huawei.com>
     [not found] ` <1519280641-30258-3-git-send-email-xieyisheng1@huawei.com>
     [not found]   ` <1519280641-30258-3-git-send-email-xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2018-02-22  9:03     ` [PATCH 03/37] iommu/sva: Manage process address spaces Yisheng Xie
2018-02-22  9:03       ` Yisheng Xie
2018-02-22  9:03       ` Yisheng Xie
     [not found]       ` <63e4b545-c6e5-f4b4-3ba1-cf7a142b64ec-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2018-02-22 13:17         ` Jean-Philippe Brucker
2018-02-22 13:17           ` Jean-Philippe Brucker
2018-02-22 13:17           ` Jean-Philippe Brucker
     [not found] ` <1519280641-30258-37-git-send-email-xieyisheng1@huawei.com>
     [not found]   ` <1519280641-30258-37-git-send-email-xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2018-03-19  9:47     ` [PATCH 37/37] vfio: Add support for Shared Virtual Addressing Yisheng Xie
2018-03-19  9:47       ` Yisheng Xie
2018-03-19  9:47       ` Yisheng Xie
     [not found]       ` <e3ae1693-0ed8-65e1-352a-07f79038dcbd-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2018-03-21 13:40         ` Jean-Philippe Brucker
2018-03-21 13:40           ` Jean-Philippe Brucker
2018-03-21 13:40           ` Jean-Philippe Brucker
     [not found] ` <1519280641-30258-27-git-send-email-xieyisheng1@huawei.com>
     [not found]   ` <1519280641-30258-27-git-send-email-xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2018-03-19 11:03     ` [PATCH 27/37] iommu/arm-smmu-v3: Register fault workqueue Yisheng Xie
2018-03-19 11:03       ` Yisheng Xie
2018-03-19 11:03       ` Yisheng Xie
     [not found]       ` <f2841aac-4c5e-4ac2-fa4d-81d6b2857503-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2018-03-21 13:24         ` Jean-Philippe Brucker
2018-03-21 13:24           ` Jean-Philippe Brucker
2018-03-21 13:24           ` Jean-Philippe Brucker
     [not found]           ` <cabd71fd-c54c-c21a-a5b7-227e69fa4286-5wv7dgnIgG8@public.gmane.org>
2018-03-22  1:09             ` Yisheng Xie
2018-03-22  1:09               ` Yisheng Xie
2018-03-22  1:09               ` Yisheng Xie
     [not found]               ` <c4f0a441-e975-395a-fc38-3686db21227d-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2018-04-04 10:13                 ` Jean-Philippe Brucker
2018-04-04 10:13                   ` Jean-Philippe Brucker
2018-04-04 10:13                   ` Jean-Philippe Brucker
2018-02-12 18:33 [PATCH 00/37] Shared Virtual Addressing for the IOMMU Jean-Philippe Brucker
     [not found] ` <20180212183352.22730-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-02-12 18:33   ` [PATCH 03/37] iommu/sva: Manage process address spaces Jean-Philippe Brucker
2018-02-12 18:33     ` Jean-Philippe Brucker
2018-02-12 18:33     ` Jean-Philippe Brucker
     [not found]     ` <20180212183352.22730-4-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-03-01  6:52       ` Lu Baolu
2018-03-01  6:52         ` Lu Baolu
2018-03-01  6:52         ` Lu Baolu
     [not found]         ` <5A97A324.9050605-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-03-01  8:04           ` Christian König
2018-03-01  8:04             ` Christian König
2018-03-01  8:04             ` Christian König
     [not found]             ` <cd4d7a98-e45e-7066-345f-52d8eef926a2-5C7GfCeVMHo@public.gmane.org>
2018-03-02 16:42               ` Jean-Philippe Brucker
2018-03-02 16:42                 ` Jean-Philippe Brucker
2018-03-02 16:42                 ` Jean-Philippe Brucker
2018-03-02 16:19           ` Jean-Philippe Brucker
2018-03-02 16:19             ` Jean-Philippe Brucker
2018-03-02 16:19             ` Jean-Philippe Brucker
2018-03-05 15:28       ` Sinan Kaya
2018-03-05 15:28         ` Sinan Kaya
2018-03-05 15:28         ` Sinan Kaya
     [not found]         ` <27a044ee-0ed7-0470-0fef-289d0d5cf5e8-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-03-06 10:37           ` Jean-Philippe Brucker
2018-03-06 10:37             ` Jean-Philippe Brucker
2018-03-06 10:37             ` Jean-Philippe Brucker
2018-04-24  1:32       ` Sinan Kaya
2018-04-24  1:32         ` Sinan Kaya
2018-04-24  1:32         ` Sinan Kaya
     [not found]         ` <57d77955-caa7-ddac-df7d-7eef1f05dbb2-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-04-24  9:33           ` Jean-Philippe Brucker
2018-04-24  9:33             ` Jean-Philippe Brucker
2018-04-24  9:33             ` Jean-Philippe Brucker
     [not found]             ` <66ec18ca-ea4e-d224-c9c5-8dbee5da8a72-5wv7dgnIgG8@public.gmane.org>
2018-04-24 17:17               ` Sinan Kaya
2018-04-24 17:17                 ` Sinan Kaya
2018-04-24 17:17                 ` Sinan Kaya
     [not found]                 ` <e7c4053a-20cc-d2db-16da-100b1157eca4-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-04-24 18:52                   ` Jean-Philippe Brucker via iommu
2018-04-24 18:52                     ` Jean-Philippe Brucker
2018-04-10 18:53     ` Sinan Kaya
2018-04-10 18:53       ` Sinan Kaya
2018-04-10 18:53       ` Sinan Kaya
     [not found]       ` <04d4d161-ed72-f6b6-9b94-1d60bd79ef94-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-04-13 10:59         ` Jean-Philippe Brucker
2018-04-13 10:59           ` Jean-Philippe Brucker
2018-04-13 10:59           ` Jean-Philippe Brucker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.