iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: Lu Baolu <baolu.lu@linux.intel.com>
To: Jean-Philippe Brucker <jean-philippe@linaro.org>,
	iommu@lists.linux-foundation.org, devicetree@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org,
	linux-mm@kvack.org
Cc: fenghua.yu@intel.com, kevin.tian@intel.com, jgg@ziepe.ca,
	catalin.marinas@arm.com, robin.murphy@arm.com, hch@infradead.org,
	zhangfei.gao@linaro.org, felix.kuehling@amd.com, will@kernel.org,
	christian.koenig@amd.com
Subject: Re: [PATCH v7 04/24] iommu: Add a page fault handler
Date: Wed, 20 May 2020 14:42:21 +0800	[thread overview]
Message-ID: <c840d771-188d-9ee5-d117-e4b91d29b329@linux.intel.com> (raw)
In-Reply-To: <20200519175502.2504091-5-jean-philippe@linaro.org>

Hi Jean,

On 2020/5/20 1:54, Jean-Philippe Brucker wrote:
> Some systems allow devices to handle I/O Page Faults in the core mm. For
> example systems implementing the PCIe PRI extension or Arm SMMU stall
> model. Infrastructure for reporting these recoverable page faults was
> added to the IOMMU core by commit 0c830e6b3282 ("iommu: Introduce device
> fault report API"). Add a page fault handler for host SVA.
> 
> IOMMU driver can now instantiate several fault workqueues and link them
> to IOPF-capable devices. Drivers can choose between a single global
> workqueue, one per IOMMU device, one per low-level fault queue, one per
> domain, etc.
> 
> When it receives a fault event, supposedly in an IRQ handler, the IOMMU
> driver reports the fault using iommu_report_device_fault(), which calls
> the registered handler. The page fault handler then calls the mm fault
> handler, and reports either success or failure with iommu_page_response().
> When the handler succeeded, the IOMMU retries the access.
> 
> The iopf_param pointer could be embedded into iommu_fault_param. But
> putting iopf_param into the iommu_param structure allows us not to care
> about ordering between calls to iopf_queue_add_device() and
> iommu_register_device_fault_handler().
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
> v6->v7: Fix leak in iopf_queue_discard_partial()
> ---
>   drivers/iommu/Kconfig      |   4 +
>   drivers/iommu/Makefile     |   1 +
>   include/linux/iommu.h      |  51 +++++
>   drivers/iommu/io-pgfault.c | 459 +++++++++++++++++++++++++++++++++++++
>   4 files changed, 515 insertions(+)
>   create mode 100644 drivers/iommu/io-pgfault.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index d9fa5b410015..15e9dc4e503c 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -107,6 +107,10 @@ config IOMMU_SVA
>   	bool
>   	select IOASID
>   
> +config IOMMU_PAGE_FAULT
> +	bool
> +	select IOMMU_SVA
> +
>   config FSL_PAMU
>   	bool "Freescale IOMMU support"
>   	depends on PCI
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 40c800dd4e3e..bf5cb4ee8409 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -4,6 +4,7 @@ obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>   obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
>   obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o
>   obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
> +obj-$(CONFIG_IOMMU_PAGE_FAULT) += io-pgfault.o
>   obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>   obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>   obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o

[SNIP]

> +
> +static enum iommu_page_response_code
> +iopf_handle_single(struct iopf_fault *iopf)
> +{
> +	vm_fault_t ret;
> +	struct mm_struct *mm;
> +	struct vm_area_struct *vma;
> +	unsigned int access_flags = 0;
> +	unsigned int fault_flags = FAULT_FLAG_REMOTE;
> +	struct iommu_fault_page_request *prm = &iopf->fault.prm;
> +	enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
> +
> +	if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
> +		return status;
> +
> +	mm = iommu_sva_find(prm->pasid);
> +	if (IS_ERR_OR_NULL(mm))
> +		return status;
> +
> +	down_read(&mm->mmap_sem);
> +
> +	vma = find_extend_vma(mm, prm->addr);
> +	if (!vma)
> +		/* Unmapped area */
> +		goto out_put_mm;
> +
> +	if (prm->perm & IOMMU_FAULT_PERM_READ)
> +		access_flags |= VM_READ;
> +
> +	if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
> +		access_flags |= VM_WRITE;
> +		fault_flags |= FAULT_FLAG_WRITE;
> +	}
> +
> +	if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
> +		access_flags |= VM_EXEC;
> +		fault_flags |= FAULT_FLAG_INSTRUCTION;
> +	}
> +
> +	if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
> +		fault_flags |= FAULT_FLAG_USER;
> +
> +	if (access_flags & ~vma->vm_flags)
> +		/* Access fault */
> +		goto out_put_mm;
> +
> +	ret = handle_mm_fault(vma, prm->addr, fault_flags);
> +	status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :

Do you mind telling why it's IOMMU_PAGE_RESP_INVALID but not
IOMMU_PAGE_RESP_FAILURE?

> +		IOMMU_PAGE_RESP_SUCCESS;
> +
> +out_put_mm:
> +	up_read(&mm->mmap_sem);
> +	mmput(mm);
> +
> +	return status;
> +}
> +
> +static void iopf_handle_group(struct work_struct *work)
> +{
> +	struct iopf_group *group;
> +	struct iopf_fault *iopf, *next;
> +	enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;
> +
> +	group = container_of(work, struct iopf_group, work);
> +
> +	list_for_each_entry_safe(iopf, next, &group->faults, list) {
> +		/*
> +		 * For the moment, errors are sticky: don't handle subsequent
> +		 * faults in the group if there is an error.
> +		 */
> +		if (status == IOMMU_PAGE_RESP_SUCCESS)
> +			status = iopf_handle_single(iopf);
> +
> +		if (!(iopf->fault.prm.flags &
> +		      IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
> +			kfree(iopf);
> +	}
> +
> +	iopf_complete_group(group->dev, &group->last_fault, status);
> +	kfree(group);
> +}
> +
> +/**
> + * iommu_queue_iopf - IO Page Fault handler
> + * @evt: fault event

@fault?

> + * @cookie: struct device, passed to iommu_register_device_fault_handler.
> + *
> + * Add a fault to the device workqueue, to be handled by mm.
> + *
> + * This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must discard
> + * them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't
> + * expect a response. It may be generated when disabling a PASID (issuing a
> + * PASID stop request) by some PCI devices.
> + *
> + * The PASID stop request is issued by the device driver before unbind(). Once
> + * it completes, no page request is generated for this PASID anymore and
> + * outstanding ones have been pushed to the IOMMU (as per PCIe 4.0r1.0 - 6.20.1
> + * and 10.4.1.2 - Managing PASID TLP Prefix Usage). Some PCI devices will wait
> + * for all outstanding page requests to come back with a response before
> + * completing the PASID stop request. Others do not wait for page responses, and
> + * instead issue this Stop Marker that tells us when the PASID can be
> + * reallocated.
> + *
> + * It is safe to discard the Stop Marker because it is an optimization.
> + * a. Page requests, which are posted requests, have been flushed to the IOMMU
> + *    when the stop request completes.
> + * b. We flush all fault queues on unbind() before freeing the PASID.
> + *
> + * So even though the Stop Marker might be issued by the device *after* the stop
> + * request completes, outstanding faults will have been dealt with by the time
> + * we free the PASID.
> + *
> + * Return: 0 on success and <0 on error.
> + */
> +int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
> +{
> +	int ret;
> +	struct iopf_group *group;
> +	struct iopf_fault *iopf, *next;
> +	struct iopf_device_param *iopf_param;
> +
> +	struct device *dev = cookie;
> +	struct dev_iommu *param = dev->iommu;
> +
> +	lockdep_assert_held(&param->lock);
> +
> +	if (fault->type != IOMMU_FAULT_PAGE_REQ)
> +		/* Not a recoverable page fault */
> +		return -EOPNOTSUPP;
> +
> +	/*
> +	 * As long as we're holding param->lock, the queue can't be unlinked
> +	 * from the device and therefore cannot disappear.
> +	 */
> +	iopf_param = param->iopf_param;
> +	if (!iopf_param)
> +		return -ENODEV;
> +
> +	if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) {
> +		iopf = kzalloc(sizeof(*iopf), GFP_KERNEL);
> +		if (!iopf)
> +			return -ENOMEM;
> +
> +		iopf->fault = *fault;
> +
> +		/* Non-last request of a group. Postpone until the last one */
> +		list_add(&iopf->list, &iopf_param->partial);
> +
> +		return 0;
> +	}
> +
> +	group = kzalloc(sizeof(*group), GFP_KERNEL);
> +	if (!group) {
> +		/*
> +		 * The caller will send a response to the hardware. But we do
> +		 * need to clean up before leaving, otherwise partial faults
> +		 * will be stuck.
> +		 */
> +		ret = -ENOMEM;
> +		goto cleanup_partial;
> +	}
> +
> +	group->dev = dev;
> +	group->last_fault.fault = *fault;
> +	INIT_LIST_HEAD(&group->faults);
> +	list_add(&group->last_fault.list, &group->faults);
> +	INIT_WORK(&group->work, iopf_handle_group);
> +
> +	/* See if we have partial faults for this group */
> +	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) {
> +		if (iopf->fault.prm.grpid == fault->prm.grpid)
> +			/* Insert *before* the last fault */
> +			list_move(&iopf->list, &group->faults);
> +	}
> +
> +	queue_work(iopf_param->queue->wq, &group->work);
> +	return 0;
> +
> +cleanup_partial:
> +	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) {
> +		if (iopf->fault.prm.grpid == fault->prm.grpid) {
> +			list_del(&iopf->list);
> +			kfree(iopf);
> +		}
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_queue_iopf);

[SNIP]

> + 
> +
> +/**
> + * iopf_queue_add_device - Add producer to the fault queue
> + * @queue: IOPF queue
> + * @dev: device to add
> + *
> + * Return: 0 on success and <0 on error.
> + */
> +int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
> +{
> +	int ret = -EBUSY;
> +	struct iopf_device_param *iopf_param;
> +	struct dev_iommu *param = dev->iommu;
> +
> +	if (!param)
> +		return -ENODEV;
> +
> +	iopf_param = kzalloc(sizeof(*iopf_param), GFP_KERNEL);
> +	if (!iopf_param)
> +		return -ENOMEM;
> +
> +	INIT_LIST_HEAD(&iopf_param->partial);
> +	iopf_param->queue = queue; iopf_param->dev = dev;

Two lines?

> +
> +	mutex_lock(&queue->lock);
> +	mutex_lock(&param->lock);
> +	if (!param->iopf_param) {
> +		list_add(&iopf_param->queue_list, &queue->devices);
> +		param->iopf_param = iopf_param;
> +		ret = 0;
> +	}
> +	mutex_unlock(&param->lock);
> +	mutex_unlock(&queue->lock);
> +
> +	if (ret)
> +		kfree(iopf_param);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iopf_queue_add_device);
> +

[SNIP]
Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  reply	other threads:[~2020-05-20  6:42 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-19 17:54 [PATCH v7 00/24] iommu: Shared Virtual Addressing for SMMUv3 Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 01/24] mm: Add a PASID field to mm_struct Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 02/24] iommu/ioasid: Add ioasid references Jean-Philippe Brucker
2020-05-20  2:31   ` Lu Baolu
2020-05-19 17:54 ` [PATCH v7 03/24] iommu/sva: Add PASID helpers Jean-Philippe Brucker
2020-05-20  2:41   ` Lu Baolu
2020-05-19 17:54 ` [PATCH v7 04/24] iommu: Add a page fault handler Jean-Philippe Brucker
2020-05-20  6:42   ` Lu Baolu [this message]
2020-11-11 13:57     ` Jean-Philippe Brucker
2020-11-11 23:11       ` Lu Baolu
2020-05-29  9:18   ` Xiang Zheng
2020-11-11 13:57     ` Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 05/24] arm64: mm: Add asid_gen_match() helper Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 06/24] arm64: mm: Pin down ASIDs for sharing mm with devices Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 07/24] iommu/io-pgtable-arm: Move some definitions to a header Jean-Philippe Brucker
2020-05-21 14:16   ` Will Deacon
2020-05-19 17:54 ` [PATCH v7 08/24] iommu/arm-smmu-v3: Manage ASIDs with xarray Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 09/24] arm64: cpufeature: Export symbol read_sanitised_ftr_reg() Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 10/24] iommu/arm-smmu-v3: Share process page tables Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 11/24] iommu/arm-smmu-v3: Seize private ASID Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 12/24] iommu/arm-smmu-v3: Add support for VHE Jean-Philippe Brucker
2020-05-21 14:16   ` Will Deacon
2020-05-19 17:54 ` [PATCH v7 13/24] iommu/arm-smmu-v3: Enable broadcast TLB maintenance Jean-Philippe Brucker
2020-05-21 14:17   ` Will Deacon
2020-05-21 14:38     ` Marc Zyngier
2020-05-22 10:17       ` Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 14/24] iommu/arm-smmu-v3: Add SVA feature checking Jean-Philippe Brucker
2020-05-21 14:17   ` Will Deacon
2020-05-19 17:54 ` [PATCH v7 15/24] iommu/arm-smmu-v3: Add SVA device feature Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 16/24] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind() Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 17/24] iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 18/24] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update Jean-Philippe Brucker
2020-05-21 11:12   ` Will Deacon
2020-05-27  3:00   ` Xiang Zheng
2020-05-27  8:41     ` Jean-Philippe Brucker
2020-08-28  9:28   ` Zenghui Yu
2020-09-16 14:11     ` Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 19/24] iommu/arm-smmu-v3: Maintain a SID->device structure Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 20/24] dt-bindings: document stall property for IOMMU masters Jean-Philippe Brucker
2020-05-19 17:54 ` [PATCH v7 21/24] iommu/arm-smmu-v3: Add stall support for platform devices Jean-Philippe Brucker
2020-06-01 12:42   ` Shameerali Kolothum Thodi
2020-06-02  9:38     ` Jean-Philippe Brucker
2020-06-02 10:31       ` Shameerali Kolothum Thodi
2020-06-02 11:46         ` Jean-Philippe Brucker
2020-06-02 12:12           ` Shameerali Kolothum Thodi
2020-06-03  7:38             ` Jean-Philippe Brucker
2020-05-19 17:55 ` [PATCH v7 22/24] PCI/ATS: Add PRI stubs Jean-Philippe Brucker
2020-05-19 17:55 ` [PATCH v7 23/24] PCI/ATS: Export PRI functions Jean-Philippe Brucker
2020-05-19 17:55 ` [PATCH v7 24/24] iommu/arm-smmu-v3: Add support for PRI Jean-Philippe Brucker
2020-05-21 10:35 ` [PATCH v7 00/24] iommu: Shared Virtual Addressing for SMMUv3 Will Deacon
2020-05-21 14:17   ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c840d771-188d-9ee5-d117-e4b91d29b329@linux.intel.com \
    --to=baolu.lu@linux.intel.com \
    --cc=catalin.marinas@arm.com \
    --cc=christian.koenig@amd.com \
    --cc=devicetree@vger.kernel.org \
    --cc=felix.kuehling@amd.com \
    --cc=fenghua.yu@intel.com \
    --cc=hch@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jean-philippe@linaro.org \
    --cc=jgg@ziepe.ca \
    --cc=kevin.tian@intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    --cc=zhangfei.gao@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).