All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU)
@ 2020-09-30 11:53 ` Alexandre Bailon
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, bjorn.andersson, sumit.semwal, christian.koenig,
	linux-kernel, linux-media, dri-devel, linaro-mm-sig, jstephan,
	stephane.leprovost, gpain, mturquette, Alexandre Bailon

This adds a RPMsg driver that implements communication between the CPU and an
APU.
This uses VirtIO buffer to exchange messages but for sharing data, this uses
a dmabuf, mapped to be shared between CPU (userspace) and APU.
The driver is relatively generic, and should work with any SoC implementing
hardware accelerator for AI if they use support remoteproc and VirtIO.

For the people interested by the firmware or userspace library,
the sources are available here:
https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu

Alexandre Bailon (3):
  Add a RPMSG driver for the APU in the mt8183
  rpmsg: apu_rpmsg: update the way to store IOMMU mapping
  rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping

Julien STEPHAN (1):
  rpmsg: apu_rpmsg: Add support for async apu request

 drivers/rpmsg/Kconfig          |   9 +
 drivers/rpmsg/Makefile         |   1 +
 drivers/rpmsg/apu_rpmsg.c      | 752 +++++++++++++++++++++++++++++++++
 drivers/rpmsg/apu_rpmsg.h      |  52 +++
 include/uapi/linux/apu_rpmsg.h |  47 +++
 5 files changed, 861 insertions(+)
 create mode 100644 drivers/rpmsg/apu_rpmsg.c
 create mode 100644 drivers/rpmsg/apu_rpmsg.h
 create mode 100644 include/uapi/linux/apu_rpmsg.h

-- 
2.26.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU)
@ 2020-09-30 11:53 ` Alexandre Bailon
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, gpain, stephane.leprovost, jstephan, linux-kernel,
	dri-devel, linaro-mm-sig, mturquette, Alexandre Bailon,
	bjorn.andersson, christian.koenig, linux-media

This adds a RPMsg driver that implements communication between the CPU and an
APU.
This uses VirtIO buffer to exchange messages but for sharing data, this uses
a dmabuf, mapped to be shared between CPU (userspace) and APU.
The driver is relatively generic, and should work with any SoC implementing
hardware accelerator for AI if they use support remoteproc and VirtIO.

For the people interested by the firmware or userspace library,
the sources are available here:
https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu

Alexandre Bailon (3):
  Add a RPMSG driver for the APU in the mt8183
  rpmsg: apu_rpmsg: update the way to store IOMMU mapping
  rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping

Julien STEPHAN (1):
  rpmsg: apu_rpmsg: Add support for async apu request

 drivers/rpmsg/Kconfig          |   9 +
 drivers/rpmsg/Makefile         |   1 +
 drivers/rpmsg/apu_rpmsg.c      | 752 +++++++++++++++++++++++++++++++++
 drivers/rpmsg/apu_rpmsg.h      |  52 +++
 include/uapi/linux/apu_rpmsg.h |  47 +++
 5 files changed, 861 insertions(+)
 create mode 100644 drivers/rpmsg/apu_rpmsg.c
 create mode 100644 drivers/rpmsg/apu_rpmsg.h
 create mode 100644 include/uapi/linux/apu_rpmsg.h

-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183
  2020-09-30 11:53 ` Alexandre Bailon
@ 2020-09-30 11:53   ` Alexandre Bailon
  -1 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, bjorn.andersson, sumit.semwal, christian.koenig,
	linux-kernel, linux-media, dri-devel, linaro-mm-sig, jstephan,
	stephane.leprovost, gpain, mturquette, Alexandre Bailon

This adds a driver to communicate with the APU available
in the mt8183. The driver is generic and could be used for other APU.
It mostly provides a userspace interface to send messages and
and share big buffers with the APU.

Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/rpmsg/Kconfig          |   9 +
 drivers/rpmsg/Makefile         |   1 +
 drivers/rpmsg/apu_rpmsg.c      | 606 +++++++++++++++++++++++++++++++++
 drivers/rpmsg/apu_rpmsg.h      |  52 +++
 include/uapi/linux/apu_rpmsg.h |  36 ++
 5 files changed, 704 insertions(+)
 create mode 100644 drivers/rpmsg/apu_rpmsg.c
 create mode 100644 drivers/rpmsg/apu_rpmsg.h
 create mode 100644 include/uapi/linux/apu_rpmsg.h

diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
index f96716893c2a..3437c6fc8647 100644
--- a/drivers/rpmsg/Kconfig
+++ b/drivers/rpmsg/Kconfig
@@ -64,4 +64,13 @@ config RPMSG_VIRTIO
 	select RPMSG
 	select VIRTIO
 
+config RPMSG_APU
+	tristate "APU RPMSG driver"
+	help
+	  This provides a RPMSG driver that provides some facilities to
+	  communicate with an accelerated processing unit (APU).
+	  This creates one or more char files that could be used by userspace
+	  to send a message to an APU. In addition, this also take care of
+	  sharing the memory buffer with the APU.
+
 endmenu
diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
index ffe932ef6050..93e0f3de99c9 100644
--- a/drivers/rpmsg/Makefile
+++ b/drivers/rpmsg/Makefile
@@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
 obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
 obj-$(CONFIG_RPMSG_QCOM_SMD)	+= qcom_smd.o
 obj-$(CONFIG_RPMSG_VIRTIO)	+= virtio_rpmsg_bus.o
+obj-$(CONFIG_RPMSG_APU)		+= apu_rpmsg.o
diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
new file mode 100644
index 000000000000..5131b8b8e1f2
--- /dev/null
+++ b/drivers/rpmsg/apu_rpmsg.c
@@ -0,0 +1,606 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Copyright 2020 BayLibre SAS
+
+#include <linux/cdev.h>
+#include <linux/dma-buf.h>
+#include <linux/iommu.h>
+#include <linux/iova.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/remoteproc.h>
+#include <linux/rpmsg.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include "rpmsg_internal.h"
+
+#include <uapi/linux/apu_rpmsg.h>
+
+#include "apu_rpmsg.h"
+
+/* Maximum of APU devices supported */
+#define APU_DEV_MAX 2
+
+#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
+#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
+
+struct rpmsg_apu {
+	struct rpmsg_device *rpdev;
+	struct cdev cdev;
+	struct device dev;
+
+	struct rproc *rproc;
+	struct iommu_domain *domain;
+	struct iova_domain *iovad;
+	int iova_limit_pfn;
+};
+
+struct rpmsg_request {
+	struct completion completion;
+	struct list_head node;
+	void *req;
+};
+
+struct apu_buffer {
+	int fd;
+	struct dma_buf *dma_buf;
+	struct dma_buf_attachment *attachment;
+	struct sg_table *sg_table;
+	u32 iova;
+};
+
+/*
+ * Shared IOVA domain.
+ * The MT8183 has two VP6 core but they are sharing the IOVA.
+ * They could be used alone, or together. In order to avoid conflict,
+ * create an IOVA domain that could be shared by those two core.
+ * @iovad: The IOVA domain to share between the APU cores
+ * @refcount: Allow to automatically release the IOVA domain once all the APU
+ *            cores has been stopped
+ */
+struct apu_iova_domain {
+	struct iova_domain iovad;
+	struct kref refcount;
+};
+
+static dev_t rpmsg_major;
+static DEFINE_IDA(rpmsg_ctrl_ida);
+static DEFINE_IDA(rpmsg_minor_ida);
+static DEFINE_IDA(req_ida);
+static LIST_HEAD(requests);
+static struct apu_iova_domain *apu_iovad;
+
+static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
+			      void *priv, u32 addr)
+{
+	struct rpmsg_request *rpmsg_req;
+	struct apu_dev_request *hdr = data;
+
+	list_for_each_entry(rpmsg_req, &requests, node) {
+		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
+
+		if (hdr->id == tmp_hdr->id) {
+			memcpy(rpmsg_req->req, data, count);
+			complete(&rpmsg_req->completion);
+
+			return 0;
+		}
+	}
+
+	return 0;
+}
+
+static int apu_device_memory_map(struct rpmsg_apu *apu,
+				 struct apu_buffer *buffer)
+{
+	struct rpmsg_device *rpdev = apu->rpdev;
+	phys_addr_t phys;
+	int total_buf_space;
+	int iova_pfn;
+	int ret;
+
+	if (!buffer->fd)
+		return 0;
+
+	buffer->dma_buf = dma_buf_get(buffer->fd);
+	if (IS_ERR(buffer->dma_buf)) {
+		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
+			PTR_ERR(buffer->dma_buf));
+		return PTR_ERR(buffer->dma_buf);
+	}
+
+	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
+	if (IS_ERR(buffer->attachment)) {
+		dev_err(&rpdev->dev, "Failed to attach dma_buf\n");
+		ret = PTR_ERR(buffer->attachment);
+		goto err_dma_buf_put;
+	}
+
+	buffer->sg_table = dma_buf_map_attachment(buffer->attachment,
+						   DMA_BIDIRECTIONAL);
+	if (IS_ERR(buffer->sg_table)) {
+		dev_err(&rpdev->dev, "Failed to map attachment\n");
+		ret = PTR_ERR(buffer->sg_table);
+		goto err_dma_buf_detach;
+	}
+	phys = page_to_phys(sg_page(buffer->sg_table->sgl));
+	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
+
+	iova_pfn = alloc_iova_fast(apu->iovad, total_buf_space >> PAGE_SHIFT,
+				   apu->iova_limit_pfn, true);
+	if (!iova_pfn) {
+		dev_err(&rpdev->dev, "Failed to allocate iova address\n");
+		ret = -ENOMEM;
+		goto err_dma_unmap_attachment;
+	}
+
+	buffer->iova = PFN_PHYS(iova_pfn);
+	ret = iommu_map(apu->rproc->domain, buffer->iova, phys, total_buf_space,
+			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
+	if (ret) {
+		dev_err(&rpdev->dev, "Failed to iommu map\n");
+		goto err_free_iova;
+	}
+
+	return 0;
+
+err_free_iova:
+	free_iova(apu->iovad, iova_pfn);
+err_dma_unmap_attachment:
+	dma_buf_unmap_attachment(buffer->attachment,
+				 buffer->sg_table,
+				 DMA_BIDIRECTIONAL);
+err_dma_buf_detach:
+	dma_buf_detach(buffer->dma_buf, buffer->attachment);
+err_dma_buf_put:
+	dma_buf_put(buffer->dma_buf);
+
+	return ret;
+}
+
+static void apu_device_memory_unmap(struct rpmsg_apu *apu,
+				    struct apu_buffer *buffer)
+{
+	int total_buf_space;
+
+	if (!buffer->fd)
+		return;
+
+	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
+	iommu_unmap(apu->rproc->domain, buffer->iova, total_buf_space);
+	free_iova(apu->iovad, PHYS_PFN(buffer->iova));
+	dma_buf_unmap_attachment(buffer->attachment,
+				 buffer->sg_table,
+				 DMA_BIDIRECTIONAL);
+	dma_buf_detach(buffer->dma_buf, buffer->attachment);
+	dma_buf_put(buffer->dma_buf);
+}
+
+static int _apu_send_request(struct rpmsg_apu *apu,
+			     struct rpmsg_device *rpdev,
+			     struct apu_dev_request *req, int len)
+{
+
+	struct rpmsg_request *rpmsg_req;
+	int ret = 0;
+
+	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
+	if (req->id < 0)
+		return ret;
+
+	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
+	if (!rpmsg_req)
+		return -ENOMEM;
+
+	rpmsg_req->req = req;
+	init_completion(&rpmsg_req->completion);
+	list_add(&rpmsg_req->node, &requests);
+
+	ret = rpmsg_send(rpdev->ept, req, len);
+	if (ret)
+		goto free_req;
+
+	/* be careful with race here between timeout and callback*/
+	ret = wait_for_completion_timeout(&rpmsg_req->completion,
+					  msecs_to_jiffies(1000));
+	if (!ret)
+		ret = -ETIMEDOUT;
+	else
+		ret = 0;
+
+	ida_simple_remove(&req_ida, req->id);
+
+free_req:
+
+	list_del(&rpmsg_req->node);
+	kfree(rpmsg_req);
+
+	return ret;
+}
+
+static int apu_send_request(struct rpmsg_apu *apu,
+			    struct apu_request *req)
+{
+	int ret;
+	struct rpmsg_device *rpdev = apu->rpdev;
+	struct apu_dev_request *dev_req;
+	struct apu_buffer *buffer;
+
+	int size = req->size_in + req->size_out +
+		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
+	u32 *fd = (u32 *)(req->data + req->size_in + req->size_out);
+	u32 *buffer_size = (u32 *)(fd + req->count);
+	u32 *dev_req_da;
+	u32 *dev_req_buffer_size;
+	int i;
+
+	dev_req = kmalloc(size, GFP_KERNEL);
+	if (!dev_req)
+		return -ENOMEM;
+
+	dev_req->cmd = req->cmd;
+	dev_req->size_in = req->size_in;
+	dev_req->size_out = req->size_out;
+	dev_req->count = req->count;
+	dev_req_da = (u32 *)(dev_req->data + req->size_in + req->size_out);
+	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
+	memcpy(dev_req->data, req->data, req->size_in);
+
+	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
+	for (i = 0; i < req->count; i++) {
+		buffer[i].fd = fd[i];
+		ret = apu_device_memory_map(apu, &buffer[i]);
+		if (ret)
+			goto err_free_memory;
+		dev_req_da[i] = buffer[i].iova;
+		dev_req_buffer_size[i] = buffer_size[i];
+	}
+
+	ret = _apu_send_request(apu, rpdev, dev_req, size);
+
+err_free_memory:
+	for (i--; i >= 0; i--)
+		apu_device_memory_unmap(apu, &buffer[i]);
+
+	req->result = dev_req->result;
+	req->size_in = dev_req->size_in;
+	req->size_out = dev_req->size_out;
+	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
+	       sizeof(u32) * req->count);
+
+	kfree(buffer);
+	kfree(dev_req);
+
+	return ret;
+}
+
+
+static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
+			       unsigned long arg)
+{
+	struct rpmsg_apu *apu = fp->private_data;
+	struct apu_request apu_req;
+	struct apu_request *apu_req_full;
+	void __user *argp = (void __user *)arg;
+	int len;
+	int ret;
+
+	switch (cmd) {
+	case APU_SEND_REQ_IOCTL:
+		/* Get the header */
+		if (copy_from_user(&apu_req, argp,
+				   sizeof(apu_req)))
+			return -EFAULT;
+
+		len = sizeof(*apu_req_full) + apu_req.size_in +
+			apu_req.size_out + apu_req.count * sizeof(u32) * 2;
+		apu_req_full = kzalloc(len, GFP_KERNEL);
+		if (!apu_req_full)
+			return -ENOMEM;
+
+		/* Get the whole request */
+		if (copy_from_user(apu_req_full, argp, len)) {
+			kfree(apu_req_full);
+			return -EFAULT;
+		}
+
+		ret = apu_send_request(apu, apu_req_full);
+		if (ret) {
+			kfree(apu_req_full);
+			return ret;
+		}
+
+		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
+				 sizeof(u32) * apu_req_full->count +
+				 apu_req_full->size_in + apu_req_full->size_out))
+			ret = -EFAULT;
+
+		kfree(apu_req_full);
+		return ret;
+
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
+{
+	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
+
+	get_device(&apu->dev);
+	filp->private_data = apu;
+
+	return 0;
+}
+
+static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
+{
+	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
+
+	put_device(&apu->dev);
+
+	return 0;
+}
+
+static const struct file_operations rpmsg_eptdev_fops = {
+	.owner = THIS_MODULE,
+	.open = rpmsg_eptdev_open,
+	.release = rpmsg_eptdev_release,
+	.unlocked_ioctl = rpmsg_eptdev_ioctl,
+	.compat_ioctl = rpmsg_eptdev_ioctl,
+};
+
+static void iova_domain_release(struct kref *ref)
+{
+	put_iova_domain(&apu_iovad->iovad);
+	kfree(apu_iovad);
+	apu_iovad = NULL;
+}
+
+static struct fw_rsc_iova *apu_find_rcs_iova(struct rpmsg_apu *apu)
+{
+	struct rproc *rproc = apu->rproc;
+	struct resource_table *table;
+	struct fw_rsc_iova *rsc;
+	int i;
+
+	table = rproc->table_ptr;
+	for (i = 0; i < table->num; i++) {
+		int offset = table->offset[i];
+		struct fw_rsc_hdr *hdr = (void *)table + offset;
+
+		switch (hdr->type) {
+		case RSC_VENDOR_IOVA:
+			rsc = (void *)hdr + sizeof(*hdr);
+				return rsc;
+			break;
+		default:
+			continue;
+		}
+	}
+
+	return NULL;
+}
+
+static int apu_reserve_iova(struct rpmsg_apu *apu, struct iova_domain *iovad)
+{
+	struct rproc *rproc = apu->rproc;
+	struct resource_table *table;
+	struct fw_rsc_carveout *rsc;
+	int i;
+
+	table = rproc->table_ptr;
+	for (i = 0; i < table->num; i++) {
+		int offset = table->offset[i];
+		struct fw_rsc_hdr *hdr = (void *)table + offset;
+
+		if (hdr->type == RSC_CARVEOUT) {
+			struct iova *iova;
+
+			rsc = (void *)hdr + sizeof(*hdr);
+			iova = reserve_iova(iovad, PHYS_PFN(rsc->da),
+					    PHYS_PFN(rsc->da + rsc->len));
+			if (!iova) {
+				dev_err(&apu->dev, "failed to reserve iova\n");
+				return -ENOMEM;
+			}
+			dev_dbg(&apu->dev, "Reserve: %x - %x\n",
+				rsc->da, rsc->da + rsc->len);
+		}
+	}
+
+	return 0;
+}
+
+static int apu_init_iovad(struct rpmsg_apu *apu)
+{
+	struct fw_rsc_iova *rsc;
+
+	if (!apu->rproc->table_ptr) {
+		dev_err(&apu->dev,
+			"No resource_table: has the firmware been loaded ?\n");
+		return -ENODEV;
+	}
+
+	rsc = apu_find_rcs_iova(apu);
+	if (!rsc) {
+		dev_err(&apu->dev, "No iova range defined in resource_table\n");
+		return -ENOMEM;
+	}
+
+	if (!apu_iovad) {
+		apu_iovad = kzalloc(sizeof(*apu_iovad), GFP_KERNEL);
+		if (!apu_iovad)
+			return -ENOMEM;
+
+		init_iova_domain(&apu_iovad->iovad, PAGE_SIZE,
+				 PHYS_PFN(rsc->da));
+		apu_reserve_iova(apu, &apu_iovad->iovad);
+		kref_init(&apu_iovad->refcount);
+	} else
+		kref_get(&apu_iovad->refcount);
+
+	apu->iovad = &apu_iovad->iovad;
+	apu->iova_limit_pfn = PHYS_PFN(rsc->da + rsc->len) - 1;
+
+	return 0;
+}
+
+static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
+{
+	/*
+	 * To work, the APU RPMsg driver need to get the rproc device.
+	 * Currently, we only use virtio so we could use that to find the
+	 * remoteproc parent.
+	 */
+	if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
+		dev_err(&rpdev->dev, "invalid rpmsg device\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
+		dev_err(&rpdev->dev, "unsupported bus\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
+}
+
+static void rpmsg_apu_release_device(struct device *dev)
+{
+	struct rpmsg_apu *apu = dev_to_apu(dev);
+
+	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
+	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
+	cdev_del(&apu->cdev);
+	kfree(apu);
+}
+
+static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
+{
+	struct rpmsg_apu *apu;
+	struct device *dev;
+	int ret;
+
+	apu = devm_kzalloc(&rpdev->dev, sizeof(*apu), GFP_KERNEL);
+	if (!apu)
+		return -ENOMEM;
+	apu->rpdev = rpdev;
+
+	apu->rproc = apu_get_rproc(rpdev);
+	if (IS_ERR_OR_NULL(apu->rproc))
+		return PTR_ERR(apu->rproc);
+
+	dev = &apu->dev;
+	device_initialize(dev);
+	dev->parent = &rpdev->dev;
+
+	cdev_init(&apu->cdev, &rpmsg_eptdev_fops);
+	apu->cdev.owner = THIS_MODULE;
+
+	ret = ida_simple_get(&rpmsg_minor_ida, 0, APU_DEV_MAX, GFP_KERNEL);
+	if (ret < 0)
+		goto free_apu;
+	dev->devt = MKDEV(MAJOR(rpmsg_major), ret);
+
+	ret = ida_simple_get(&rpmsg_ctrl_ida, 0, 0, GFP_KERNEL);
+	if (ret < 0)
+		goto free_minor_ida;
+	dev->id = ret;
+	dev_set_name(&apu->dev, "apu%d", ret);
+
+	ret = cdev_add(&apu->cdev, dev->devt, 1);
+	if (ret)
+		goto free_ctrl_ida;
+
+	/* We can now rely on the release function for cleanup */
+	dev->release = rpmsg_apu_release_device;
+
+	ret = device_add(dev);
+	if (ret) {
+		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
+		put_device(dev);
+	}
+
+	/* Make device dma capable by inheriting from parent's capabilities */
+	set_dma_ops(&rpdev->dev, get_dma_ops(apu->rproc->dev.parent));
+
+	ret = dma_coerce_mask_and_coherent(&rpdev->dev,
+					   dma_get_mask(apu->rproc->dev.parent));
+	if (ret)
+		goto err_put_device;
+
+	rpdev->dev.iommu_group = apu->rproc->dev.parent->iommu_group;
+
+	ret = apu_init_iovad(apu);
+
+	dev_set_drvdata(&rpdev->dev, apu);
+
+	return ret;
+
+err_put_device:
+	put_device(dev);
+free_ctrl_ida:
+	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
+free_minor_ida:
+	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
+free_apu:
+	put_device(dev);
+	kfree(apu);
+
+	return ret;
+}
+
+static void apu_rpmsg_remove(struct rpmsg_device *rpdev)
+{
+	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
+
+	if (apu_iovad)
+		kref_put(&apu_iovad->refcount, iova_domain_release);
+
+	device_del(&apu->dev);
+	put_device(&apu->dev);
+	kfree(apu);
+}
+
+static const struct rpmsg_device_id apu_rpmsg_match[] = {
+	{ APU_RPMSG_SERVICE_MT8183 },
+	{}
+};
+
+static struct rpmsg_driver apu_rpmsg_driver = {
+	.probe = apu_rpmsg_probe,
+	.remove = apu_rpmsg_remove,
+	.callback = apu_rpmsg_callback,
+	.id_table = apu_rpmsg_match,
+	.drv  = {
+		.name  = "apu_rpmsg",
+	},
+};
+
+static int __init apu_rpmsg_init(void)
+{
+	int ret;
+
+	ret = alloc_chrdev_region(&rpmsg_major, 0, APU_DEV_MAX, "apu");
+	if (ret < 0) {
+		pr_err("apu: failed to allocate char dev region\n");
+		return ret;
+	}
+
+	return register_rpmsg_driver(&apu_rpmsg_driver);
+}
+arch_initcall(apu_rpmsg_init);
+
+static void __exit apu_rpmsg_exit(void)
+{
+	unregister_rpmsg_driver(&apu_rpmsg_driver);
+}
+module_exit(apu_rpmsg_exit);
+
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("APU RPMSG driver");
diff --git a/drivers/rpmsg/apu_rpmsg.h b/drivers/rpmsg/apu_rpmsg.h
new file mode 100644
index 000000000000..54b5b7880750
--- /dev/null
+++ b/drivers/rpmsg/apu_rpmsg.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright 2020 BayLibre SAS
+ */
+
+#ifndef __APU_RPMSG_H__
+#define __APU_RPMSG_H__
+
+/*
+ * Firmware request, must be aligned with the one defined in firmware.
+ * @id: Request id, used in the case of reply, to find the pending request
+ * @cmd: The command id to execute in the firmware
+ * @result: The result of the command executed on the firmware
+ * @size: The size of the data available in this request
+ * @count: The number of shared buffer
+ * @data: Contains the data attached with the request if size is greater than
+ *        zero, and the addresses of shared buffers if count is greater than
+ *        zero. Both the data and the shared buffer could be read and write
+ *        by the APU.
+ */
+struct  apu_dev_request {
+	u16 id;
+	u16 cmd;
+	u16 result;
+	u16 size_in;
+	u16 size_out;
+	u16 count;
+	u8 data[0];
+} __packed;
+
+#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
+#define APU_CTRL_SRC 1
+#define APU_CTRL_DST 1
+
+/* Vendor specific resource table entry */
+#define RSC_VENDOR_IOVA 128
+
+/*
+ * Firmware IOVA resource table entry
+ * Define a range of virtual device address that could mapped using the IOMMU.
+ * @da: Start virtual device address
+ * @len: Length of the virtual device address
+ * @name: name of the resource
+ */
+struct fw_rsc_iova {
+	u32 da;
+	u32 len;
+	u32 reserved;
+	u8 name[32];
+} __packed;
+
+#endif /* __APU_RPMSG_H__ */
diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
new file mode 100644
index 000000000000..81c9e4af9a94
--- /dev/null
+++ b/include/uapi/linux/apu_rpmsg.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2020 BayLibre
+ */
+
+#ifndef _UAPI_RPMSG_APU_H_
+#define _UAPI_RPMSG_APU_H_
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/*
+ * Structure containing the APU request from userspace application
+ * @cmd: The id of the command to execute on the APU
+ * @result: The result of the command executed on the APU
+ * @size: The size of the data available in this request
+ * @count: The number of shared buffer
+ * @data: Contains the data attached with the request if size is greater than
+ *        zero, and the files descriptors of shared buffers if count is greater
+ *        than zero. Both the data and the shared buffer could be read and write
+ *        by the APU.
+ */
+struct apu_request {
+	__u16 cmd;
+	__u16 result;
+	__u16 size_in;
+	__u16 size_out;
+	__u16 count;
+	__u16 reserved;
+	__u8 data[0];
+};
+
+/* Send synchronous request to an APU */
+#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
+
+#endif
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183
@ 2020-09-30 11:53   ` Alexandre Bailon
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, gpain, stephane.leprovost, jstephan, linux-kernel,
	dri-devel, linaro-mm-sig, mturquette, Alexandre Bailon,
	bjorn.andersson, christian.koenig, linux-media

This adds a driver to communicate with the APU available
in the mt8183. The driver is generic and could be used for other APU.
It mostly provides a userspace interface to send messages and
and share big buffers with the APU.

Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/rpmsg/Kconfig          |   9 +
 drivers/rpmsg/Makefile         |   1 +
 drivers/rpmsg/apu_rpmsg.c      | 606 +++++++++++++++++++++++++++++++++
 drivers/rpmsg/apu_rpmsg.h      |  52 +++
 include/uapi/linux/apu_rpmsg.h |  36 ++
 5 files changed, 704 insertions(+)
 create mode 100644 drivers/rpmsg/apu_rpmsg.c
 create mode 100644 drivers/rpmsg/apu_rpmsg.h
 create mode 100644 include/uapi/linux/apu_rpmsg.h

diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
index f96716893c2a..3437c6fc8647 100644
--- a/drivers/rpmsg/Kconfig
+++ b/drivers/rpmsg/Kconfig
@@ -64,4 +64,13 @@ config RPMSG_VIRTIO
 	select RPMSG
 	select VIRTIO
 
+config RPMSG_APU
+	tristate "APU RPMSG driver"
+	help
+	  This provides a RPMSG driver that provides some facilities to
+	  communicate with an accelerated processing unit (APU).
+	  This creates one or more char files that could be used by userspace
+	  to send a message to an APU. In addition, this also take care of
+	  sharing the memory buffer with the APU.
+
 endmenu
diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
index ffe932ef6050..93e0f3de99c9 100644
--- a/drivers/rpmsg/Makefile
+++ b/drivers/rpmsg/Makefile
@@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
 obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
 obj-$(CONFIG_RPMSG_QCOM_SMD)	+= qcom_smd.o
 obj-$(CONFIG_RPMSG_VIRTIO)	+= virtio_rpmsg_bus.o
+obj-$(CONFIG_RPMSG_APU)		+= apu_rpmsg.o
diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
new file mode 100644
index 000000000000..5131b8b8e1f2
--- /dev/null
+++ b/drivers/rpmsg/apu_rpmsg.c
@@ -0,0 +1,606 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Copyright 2020 BayLibre SAS
+
+#include <linux/cdev.h>
+#include <linux/dma-buf.h>
+#include <linux/iommu.h>
+#include <linux/iova.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/remoteproc.h>
+#include <linux/rpmsg.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include "rpmsg_internal.h"
+
+#include <uapi/linux/apu_rpmsg.h>
+
+#include "apu_rpmsg.h"
+
+/* Maximum of APU devices supported */
+#define APU_DEV_MAX 2
+
+#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
+#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
+
+struct rpmsg_apu {
+	struct rpmsg_device *rpdev;
+	struct cdev cdev;
+	struct device dev;
+
+	struct rproc *rproc;
+	struct iommu_domain *domain;
+	struct iova_domain *iovad;
+	int iova_limit_pfn;
+};
+
+struct rpmsg_request {
+	struct completion completion;
+	struct list_head node;
+	void *req;
+};
+
+struct apu_buffer {
+	int fd;
+	struct dma_buf *dma_buf;
+	struct dma_buf_attachment *attachment;
+	struct sg_table *sg_table;
+	u32 iova;
+};
+
+/*
+ * Shared IOVA domain.
+ * The MT8183 has two VP6 core but they are sharing the IOVA.
+ * They could be used alone, or together. In order to avoid conflict,
+ * create an IOVA domain that could be shared by those two core.
+ * @iovad: The IOVA domain to share between the APU cores
+ * @refcount: Allow to automatically release the IOVA domain once all the APU
+ *            cores has been stopped
+ */
+struct apu_iova_domain {
+	struct iova_domain iovad;
+	struct kref refcount;
+};
+
+static dev_t rpmsg_major;
+static DEFINE_IDA(rpmsg_ctrl_ida);
+static DEFINE_IDA(rpmsg_minor_ida);
+static DEFINE_IDA(req_ida);
+static LIST_HEAD(requests);
+static struct apu_iova_domain *apu_iovad;
+
+static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
+			      void *priv, u32 addr)
+{
+	struct rpmsg_request *rpmsg_req;
+	struct apu_dev_request *hdr = data;
+
+	list_for_each_entry(rpmsg_req, &requests, node) {
+		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
+
+		if (hdr->id == tmp_hdr->id) {
+			memcpy(rpmsg_req->req, data, count);
+			complete(&rpmsg_req->completion);
+
+			return 0;
+		}
+	}
+
+	return 0;
+}
+
+static int apu_device_memory_map(struct rpmsg_apu *apu,
+				 struct apu_buffer *buffer)
+{
+	struct rpmsg_device *rpdev = apu->rpdev;
+	phys_addr_t phys;
+	int total_buf_space;
+	int iova_pfn;
+	int ret;
+
+	if (!buffer->fd)
+		return 0;
+
+	buffer->dma_buf = dma_buf_get(buffer->fd);
+	if (IS_ERR(buffer->dma_buf)) {
+		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
+			PTR_ERR(buffer->dma_buf));
+		return PTR_ERR(buffer->dma_buf);
+	}
+
+	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
+	if (IS_ERR(buffer->attachment)) {
+		dev_err(&rpdev->dev, "Failed to attach dma_buf\n");
+		ret = PTR_ERR(buffer->attachment);
+		goto err_dma_buf_put;
+	}
+
+	buffer->sg_table = dma_buf_map_attachment(buffer->attachment,
+						   DMA_BIDIRECTIONAL);
+	if (IS_ERR(buffer->sg_table)) {
+		dev_err(&rpdev->dev, "Failed to map attachment\n");
+		ret = PTR_ERR(buffer->sg_table);
+		goto err_dma_buf_detach;
+	}
+	phys = page_to_phys(sg_page(buffer->sg_table->sgl));
+	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
+
+	iova_pfn = alloc_iova_fast(apu->iovad, total_buf_space >> PAGE_SHIFT,
+				   apu->iova_limit_pfn, true);
+	if (!iova_pfn) {
+		dev_err(&rpdev->dev, "Failed to allocate iova address\n");
+		ret = -ENOMEM;
+		goto err_dma_unmap_attachment;
+	}
+
+	buffer->iova = PFN_PHYS(iova_pfn);
+	ret = iommu_map(apu->rproc->domain, buffer->iova, phys, total_buf_space,
+			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
+	if (ret) {
+		dev_err(&rpdev->dev, "Failed to iommu map\n");
+		goto err_free_iova;
+	}
+
+	return 0;
+
+err_free_iova:
+	free_iova(apu->iovad, iova_pfn);
+err_dma_unmap_attachment:
+	dma_buf_unmap_attachment(buffer->attachment,
+				 buffer->sg_table,
+				 DMA_BIDIRECTIONAL);
+err_dma_buf_detach:
+	dma_buf_detach(buffer->dma_buf, buffer->attachment);
+err_dma_buf_put:
+	dma_buf_put(buffer->dma_buf);
+
+	return ret;
+}
+
+static void apu_device_memory_unmap(struct rpmsg_apu *apu,
+				    struct apu_buffer *buffer)
+{
+	int total_buf_space;
+
+	if (!buffer->fd)
+		return;
+
+	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
+	iommu_unmap(apu->rproc->domain, buffer->iova, total_buf_space);
+	free_iova(apu->iovad, PHYS_PFN(buffer->iova));
+	dma_buf_unmap_attachment(buffer->attachment,
+				 buffer->sg_table,
+				 DMA_BIDIRECTIONAL);
+	dma_buf_detach(buffer->dma_buf, buffer->attachment);
+	dma_buf_put(buffer->dma_buf);
+}
+
+static int _apu_send_request(struct rpmsg_apu *apu,
+			     struct rpmsg_device *rpdev,
+			     struct apu_dev_request *req, int len)
+{
+
+	struct rpmsg_request *rpmsg_req;
+	int ret = 0;
+
+	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
+	if (req->id < 0)
+		return ret;
+
+	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
+	if (!rpmsg_req)
+		return -ENOMEM;
+
+	rpmsg_req->req = req;
+	init_completion(&rpmsg_req->completion);
+	list_add(&rpmsg_req->node, &requests);
+
+	ret = rpmsg_send(rpdev->ept, req, len);
+	if (ret)
+		goto free_req;
+
+	/* be careful with race here between timeout and callback*/
+	ret = wait_for_completion_timeout(&rpmsg_req->completion,
+					  msecs_to_jiffies(1000));
+	if (!ret)
+		ret = -ETIMEDOUT;
+	else
+		ret = 0;
+
+	ida_simple_remove(&req_ida, req->id);
+
+free_req:
+
+	list_del(&rpmsg_req->node);
+	kfree(rpmsg_req);
+
+	return ret;
+}
+
+static int apu_send_request(struct rpmsg_apu *apu,
+			    struct apu_request *req)
+{
+	int ret;
+	struct rpmsg_device *rpdev = apu->rpdev;
+	struct apu_dev_request *dev_req;
+	struct apu_buffer *buffer;
+
+	int size = req->size_in + req->size_out +
+		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
+	u32 *fd = (u32 *)(req->data + req->size_in + req->size_out);
+	u32 *buffer_size = (u32 *)(fd + req->count);
+	u32 *dev_req_da;
+	u32 *dev_req_buffer_size;
+	int i;
+
+	dev_req = kmalloc(size, GFP_KERNEL);
+	if (!dev_req)
+		return -ENOMEM;
+
+	dev_req->cmd = req->cmd;
+	dev_req->size_in = req->size_in;
+	dev_req->size_out = req->size_out;
+	dev_req->count = req->count;
+	dev_req_da = (u32 *)(dev_req->data + req->size_in + req->size_out);
+	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
+	memcpy(dev_req->data, req->data, req->size_in);
+
+	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
+	for (i = 0; i < req->count; i++) {
+		buffer[i].fd = fd[i];
+		ret = apu_device_memory_map(apu, &buffer[i]);
+		if (ret)
+			goto err_free_memory;
+		dev_req_da[i] = buffer[i].iova;
+		dev_req_buffer_size[i] = buffer_size[i];
+	}
+
+	ret = _apu_send_request(apu, rpdev, dev_req, size);
+
+err_free_memory:
+	for (i--; i >= 0; i--)
+		apu_device_memory_unmap(apu, &buffer[i]);
+
+	req->result = dev_req->result;
+	req->size_in = dev_req->size_in;
+	req->size_out = dev_req->size_out;
+	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
+	       sizeof(u32) * req->count);
+
+	kfree(buffer);
+	kfree(dev_req);
+
+	return ret;
+}
+
+
+static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
+			       unsigned long arg)
+{
+	struct rpmsg_apu *apu = fp->private_data;
+	struct apu_request apu_req;
+	struct apu_request *apu_req_full;
+	void __user *argp = (void __user *)arg;
+	int len;
+	int ret;
+
+	switch (cmd) {
+	case APU_SEND_REQ_IOCTL:
+		/* Get the header */
+		if (copy_from_user(&apu_req, argp,
+				   sizeof(apu_req)))
+			return -EFAULT;
+
+		len = sizeof(*apu_req_full) + apu_req.size_in +
+			apu_req.size_out + apu_req.count * sizeof(u32) * 2;
+		apu_req_full = kzalloc(len, GFP_KERNEL);
+		if (!apu_req_full)
+			return -ENOMEM;
+
+		/* Get the whole request */
+		if (copy_from_user(apu_req_full, argp, len)) {
+			kfree(apu_req_full);
+			return -EFAULT;
+		}
+
+		ret = apu_send_request(apu, apu_req_full);
+		if (ret) {
+			kfree(apu_req_full);
+			return ret;
+		}
+
+		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
+				 sizeof(u32) * apu_req_full->count +
+				 apu_req_full->size_in + apu_req_full->size_out))
+			ret = -EFAULT;
+
+		kfree(apu_req_full);
+		return ret;
+
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
+{
+	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
+
+	get_device(&apu->dev);
+	filp->private_data = apu;
+
+	return 0;
+}
+
+static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
+{
+	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
+
+	put_device(&apu->dev);
+
+	return 0;
+}
+
+static const struct file_operations rpmsg_eptdev_fops = {
+	.owner = THIS_MODULE,
+	.open = rpmsg_eptdev_open,
+	.release = rpmsg_eptdev_release,
+	.unlocked_ioctl = rpmsg_eptdev_ioctl,
+	.compat_ioctl = rpmsg_eptdev_ioctl,
+};
+
+static void iova_domain_release(struct kref *ref)
+{
+	put_iova_domain(&apu_iovad->iovad);
+	kfree(apu_iovad);
+	apu_iovad = NULL;
+}
+
+static struct fw_rsc_iova *apu_find_rcs_iova(struct rpmsg_apu *apu)
+{
+	struct rproc *rproc = apu->rproc;
+	struct resource_table *table;
+	struct fw_rsc_iova *rsc;
+	int i;
+
+	table = rproc->table_ptr;
+	for (i = 0; i < table->num; i++) {
+		int offset = table->offset[i];
+		struct fw_rsc_hdr *hdr = (void *)table + offset;
+
+		switch (hdr->type) {
+		case RSC_VENDOR_IOVA:
+			rsc = (void *)hdr + sizeof(*hdr);
+				return rsc;
+			break;
+		default:
+			continue;
+		}
+	}
+
+	return NULL;
+}
+
+static int apu_reserve_iova(struct rpmsg_apu *apu, struct iova_domain *iovad)
+{
+	struct rproc *rproc = apu->rproc;
+	struct resource_table *table;
+	struct fw_rsc_carveout *rsc;
+	int i;
+
+	table = rproc->table_ptr;
+	for (i = 0; i < table->num; i++) {
+		int offset = table->offset[i];
+		struct fw_rsc_hdr *hdr = (void *)table + offset;
+
+		if (hdr->type == RSC_CARVEOUT) {
+			struct iova *iova;
+
+			rsc = (void *)hdr + sizeof(*hdr);
+			iova = reserve_iova(iovad, PHYS_PFN(rsc->da),
+					    PHYS_PFN(rsc->da + rsc->len));
+			if (!iova) {
+				dev_err(&apu->dev, "failed to reserve iova\n");
+				return -ENOMEM;
+			}
+			dev_dbg(&apu->dev, "Reserve: %x - %x\n",
+				rsc->da, rsc->da + rsc->len);
+		}
+	}
+
+	return 0;
+}
+
+static int apu_init_iovad(struct rpmsg_apu *apu)
+{
+	struct fw_rsc_iova *rsc;
+
+	if (!apu->rproc->table_ptr) {
+		dev_err(&apu->dev,
+			"No resource_table: has the firmware been loaded ?\n");
+		return -ENODEV;
+	}
+
+	rsc = apu_find_rcs_iova(apu);
+	if (!rsc) {
+		dev_err(&apu->dev, "No iova range defined in resource_table\n");
+		return -ENOMEM;
+	}
+
+	if (!apu_iovad) {
+		apu_iovad = kzalloc(sizeof(*apu_iovad), GFP_KERNEL);
+		if (!apu_iovad)
+			return -ENOMEM;
+
+		init_iova_domain(&apu_iovad->iovad, PAGE_SIZE,
+				 PHYS_PFN(rsc->da));
+		apu_reserve_iova(apu, &apu_iovad->iovad);
+		kref_init(&apu_iovad->refcount);
+	} else
+		kref_get(&apu_iovad->refcount);
+
+	apu->iovad = &apu_iovad->iovad;
+	apu->iova_limit_pfn = PHYS_PFN(rsc->da + rsc->len) - 1;
+
+	return 0;
+}
+
+static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
+{
+	/*
+	 * To work, the APU RPMsg driver need to get the rproc device.
+	 * Currently, we only use virtio so we could use that to find the
+	 * remoteproc parent.
+	 */
+	if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
+		dev_err(&rpdev->dev, "invalid rpmsg device\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
+		dev_err(&rpdev->dev, "unsupported bus\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
+}
+
+static void rpmsg_apu_release_device(struct device *dev)
+{
+	struct rpmsg_apu *apu = dev_to_apu(dev);
+
+	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
+	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
+	cdev_del(&apu->cdev);
+	kfree(apu);
+}
+
+static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
+{
+	struct rpmsg_apu *apu;
+	struct device *dev;
+	int ret;
+
+	apu = devm_kzalloc(&rpdev->dev, sizeof(*apu), GFP_KERNEL);
+	if (!apu)
+		return -ENOMEM;
+	apu->rpdev = rpdev;
+
+	apu->rproc = apu_get_rproc(rpdev);
+	if (IS_ERR_OR_NULL(apu->rproc))
+		return PTR_ERR(apu->rproc);
+
+	dev = &apu->dev;
+	device_initialize(dev);
+	dev->parent = &rpdev->dev;
+
+	cdev_init(&apu->cdev, &rpmsg_eptdev_fops);
+	apu->cdev.owner = THIS_MODULE;
+
+	ret = ida_simple_get(&rpmsg_minor_ida, 0, APU_DEV_MAX, GFP_KERNEL);
+	if (ret < 0)
+		goto free_apu;
+	dev->devt = MKDEV(MAJOR(rpmsg_major), ret);
+
+	ret = ida_simple_get(&rpmsg_ctrl_ida, 0, 0, GFP_KERNEL);
+	if (ret < 0)
+		goto free_minor_ida;
+	dev->id = ret;
+	dev_set_name(&apu->dev, "apu%d", ret);
+
+	ret = cdev_add(&apu->cdev, dev->devt, 1);
+	if (ret)
+		goto free_ctrl_ida;
+
+	/* We can now rely on the release function for cleanup */
+	dev->release = rpmsg_apu_release_device;
+
+	ret = device_add(dev);
+	if (ret) {
+		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
+		put_device(dev);
+	}
+
+	/* Make device dma capable by inheriting from parent's capabilities */
+	set_dma_ops(&rpdev->dev, get_dma_ops(apu->rproc->dev.parent));
+
+	ret = dma_coerce_mask_and_coherent(&rpdev->dev,
+					   dma_get_mask(apu->rproc->dev.parent));
+	if (ret)
+		goto err_put_device;
+
+	rpdev->dev.iommu_group = apu->rproc->dev.parent->iommu_group;
+
+	ret = apu_init_iovad(apu);
+
+	dev_set_drvdata(&rpdev->dev, apu);
+
+	return ret;
+
+err_put_device:
+	put_device(dev);
+free_ctrl_ida:
+	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
+free_minor_ida:
+	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
+free_apu:
+	put_device(dev);
+	kfree(apu);
+
+	return ret;
+}
+
+static void apu_rpmsg_remove(struct rpmsg_device *rpdev)
+{
+	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
+
+	if (apu_iovad)
+		kref_put(&apu_iovad->refcount, iova_domain_release);
+
+	device_del(&apu->dev);
+	put_device(&apu->dev);
+	kfree(apu);
+}
+
+static const struct rpmsg_device_id apu_rpmsg_match[] = {
+	{ APU_RPMSG_SERVICE_MT8183 },
+	{}
+};
+
+static struct rpmsg_driver apu_rpmsg_driver = {
+	.probe = apu_rpmsg_probe,
+	.remove = apu_rpmsg_remove,
+	.callback = apu_rpmsg_callback,
+	.id_table = apu_rpmsg_match,
+	.drv  = {
+		.name  = "apu_rpmsg",
+	},
+};
+
+static int __init apu_rpmsg_init(void)
+{
+	int ret;
+
+	ret = alloc_chrdev_region(&rpmsg_major, 0, APU_DEV_MAX, "apu");
+	if (ret < 0) {
+		pr_err("apu: failed to allocate char dev region\n");
+		return ret;
+	}
+
+	return register_rpmsg_driver(&apu_rpmsg_driver);
+}
+arch_initcall(apu_rpmsg_init);
+
+static void __exit apu_rpmsg_exit(void)
+{
+	unregister_rpmsg_driver(&apu_rpmsg_driver);
+}
+module_exit(apu_rpmsg_exit);
+
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("APU RPMSG driver");
diff --git a/drivers/rpmsg/apu_rpmsg.h b/drivers/rpmsg/apu_rpmsg.h
new file mode 100644
index 000000000000..54b5b7880750
--- /dev/null
+++ b/drivers/rpmsg/apu_rpmsg.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright 2020 BayLibre SAS
+ */
+
+#ifndef __APU_RPMSG_H__
+#define __APU_RPMSG_H__
+
+/*
+ * Firmware request, must be aligned with the one defined in firmware.
+ * @id: Request id, used in the case of reply, to find the pending request
+ * @cmd: The command id to execute in the firmware
+ * @result: The result of the command executed on the firmware
+ * @size: The size of the data available in this request
+ * @count: The number of shared buffer
+ * @data: Contains the data attached with the request if size is greater than
+ *        zero, and the addresses of shared buffers if count is greater than
+ *        zero. Both the data and the shared buffer could be read and write
+ *        by the APU.
+ */
+struct  apu_dev_request {
+	u16 id;
+	u16 cmd;
+	u16 result;
+	u16 size_in;
+	u16 size_out;
+	u16 count;
+	u8 data[0];
+} __packed;
+
+#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
+#define APU_CTRL_SRC 1
+#define APU_CTRL_DST 1
+
+/* Vendor specific resource table entry */
+#define RSC_VENDOR_IOVA 128
+
+/*
+ * Firmware IOVA resource table entry
+ * Define a range of virtual device address that could mapped using the IOMMU.
+ * @da: Start virtual device address
+ * @len: Length of the virtual device address
+ * @name: name of the resource
+ */
+struct fw_rsc_iova {
+	u32 da;
+	u32 len;
+	u32 reserved;
+	u8 name[32];
+} __packed;
+
+#endif /* __APU_RPMSG_H__ */
diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
new file mode 100644
index 000000000000..81c9e4af9a94
--- /dev/null
+++ b/include/uapi/linux/apu_rpmsg.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2020 BayLibre
+ */
+
+#ifndef _UAPI_RPMSG_APU_H_
+#define _UAPI_RPMSG_APU_H_
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/*
+ * Structure containing the APU request from userspace application
+ * @cmd: The id of the command to execute on the APU
+ * @result: The result of the command executed on the APU
+ * @size: The size of the data available in this request
+ * @count: The number of shared buffer
+ * @data: Contains the data attached with the request if size is greater than
+ *        zero, and the files descriptors of shared buffers if count is greater
+ *        than zero. Both the data and the shared buffer could be read and write
+ *        by the APU.
+ */
+struct apu_request {
+	__u16 cmd;
+	__u16 result;
+	__u16 size_in;
+	__u16 size_out;
+	__u16 count;
+	__u16 reserved;
+	__u8 data[0];
+};
+
+/* Send synchronous request to an APU */
+#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
+
+#endif
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 2/4] rpmsg: apu_rpmsg: Add support for async apu request
  2020-09-30 11:53 ` Alexandre Bailon
@ 2020-09-30 11:53   ` Alexandre Bailon
  -1 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, bjorn.andersson, sumit.semwal, christian.koenig,
	linux-kernel, linux-media, dri-devel, linaro-mm-sig, jstephan,
	stephane.leprovost, gpain, mturquette, Alexandre Bailon

From: Julien STEPHAN <jstephan@baylibre.com>

In order to improve performances and flexibility,
add support of async request.

Signed-off-by: Julien STEPHAN <jstephan@baylibre.com>
Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/rpmsg/apu_rpmsg.c      | 208 ++++++++++++++++++++++-----------
 include/uapi/linux/apu_rpmsg.h |   6 +-
 2 files changed, 144 insertions(+), 70 deletions(-)

diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
index 5131b8b8e1f2..e14597c467d7 100644
--- a/drivers/rpmsg/apu_rpmsg.c
+++ b/drivers/rpmsg/apu_rpmsg.c
@@ -34,11 +34,16 @@ struct rpmsg_apu {
 	struct iommu_domain *domain;
 	struct iova_domain *iovad;
 	int iova_limit_pfn;
+	wait_queue_head_t waitqueue;
+	u8 available_response;
+	spinlock_t ctx_lock;
+	struct list_head requests;
 };
 
 struct rpmsg_request {
-	struct completion completion;
+	u8 ready;
 	struct list_head node;
+	struct apu_buffer *buffer;
 	void *req;
 };
 
@@ -68,25 +73,35 @@ static dev_t rpmsg_major;
 static DEFINE_IDA(rpmsg_ctrl_ida);
 static DEFINE_IDA(rpmsg_minor_ida);
 static DEFINE_IDA(req_ida);
-static LIST_HEAD(requests);
 static struct apu_iova_domain *apu_iovad;
 
-static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
+
+static int apu_rpmsg_callback(struct rpmsg_device *rpdev, void *data, int count,
 			      void *priv, u32 addr)
 {
+	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
 	struct rpmsg_request *rpmsg_req;
 	struct apu_dev_request *hdr = data;
+	unsigned long flags;
 
-	list_for_each_entry(rpmsg_req, &requests, node) {
-		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
+	spin_lock_irqsave(&apu->ctx_lock, flags);
+	list_for_each_entry(rpmsg_req, &apu->requests, node) {
+		struct apu_request *tmp_hdr = rpmsg_req->req;
 
 		if (hdr->id == tmp_hdr->id) {
-			memcpy(rpmsg_req->req, data, count);
-			complete(&rpmsg_req->completion);
-
-			return 0;
+			rpmsg_req->ready = 1;
+			apu->available_response++;
+			tmp_hdr->result = hdr->result;
+			tmp_hdr->size_in = hdr->size_in;
+			tmp_hdr->size_out = hdr->size_out;
+			memcpy(tmp_hdr->data, hdr->data,
+			       hdr->size_in+hdr->size_out);
+
+			wake_up_interruptible(&apu->waitqueue);
+			break;
 		}
 	}
+	spin_unlock_irqrestore(&apu->ctx_lock, flags);
 
 	return 0;
 }
@@ -177,48 +192,6 @@ static void apu_device_memory_unmap(struct rpmsg_apu *apu,
 	dma_buf_put(buffer->dma_buf);
 }
 
-static int _apu_send_request(struct rpmsg_apu *apu,
-			     struct rpmsg_device *rpdev,
-			     struct apu_dev_request *req, int len)
-{
-
-	struct rpmsg_request *rpmsg_req;
-	int ret = 0;
-
-	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
-	if (req->id < 0)
-		return ret;
-
-	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
-	if (!rpmsg_req)
-		return -ENOMEM;
-
-	rpmsg_req->req = req;
-	init_completion(&rpmsg_req->completion);
-	list_add(&rpmsg_req->node, &requests);
-
-	ret = rpmsg_send(rpdev->ept, req, len);
-	if (ret)
-		goto free_req;
-
-	/* be careful with race here between timeout and callback*/
-	ret = wait_for_completion_timeout(&rpmsg_req->completion,
-					  msecs_to_jiffies(1000));
-	if (!ret)
-		ret = -ETIMEDOUT;
-	else
-		ret = 0;
-
-	ida_simple_remove(&req_ida, req->id);
-
-free_req:
-
-	list_del(&rpmsg_req->node);
-	kfree(rpmsg_req);
-
-	return ret;
-}
-
 static int apu_send_request(struct rpmsg_apu *apu,
 			    struct apu_request *req)
 {
@@ -226,6 +199,8 @@ static int apu_send_request(struct rpmsg_apu *apu,
 	struct rpmsg_device *rpdev = apu->rpdev;
 	struct apu_dev_request *dev_req;
 	struct apu_buffer *buffer;
+	struct rpmsg_request *rpmsg_req;
+	unsigned long flags;
 
 	int size = req->size_in + req->size_out +
 		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
@@ -257,24 +232,63 @@ static int apu_send_request(struct rpmsg_apu *apu,
 		dev_req_buffer_size[i] = buffer_size[i];
 	}
 
-	ret = _apu_send_request(apu, rpdev, dev_req, size);
+	ret = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
+	if (ret < 0)
+		goto err_free_memory;
+
+	dev_req->id = ret;
+
+	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
+	if (!rpmsg_req) {
+		ret =  -ENOMEM;
+		goto err_ida_remove;
+	}
 
+	req->id = dev_req->id;
+	rpmsg_req->req = req;
+	rpmsg_req->buffer = buffer;
+	spin_lock_irqsave(&apu->ctx_lock, flags);
+	list_add(&rpmsg_req->node, &apu->requests);
+	spin_unlock_irqrestore(&apu->ctx_lock, flags);
+
+	ret = rpmsg_send(rpdev->ept, dev_req, size);
+	if (ret < 0)
+		goto err;
+
+	kfree(dev_req);
+
+	return req->id;
+
+err:
+	list_del(&rpmsg_req->node);
+	kfree(rpmsg_req);
+	kfree(req);
+err_ida_remove:
+	ida_simple_remove(&req_ida, dev_req->id);
 err_free_memory:
 	for (i--; i >= 0; i--)
 		apu_device_memory_unmap(apu, &buffer[i]);
 
-	req->result = dev_req->result;
-	req->size_in = dev_req->size_in;
-	req->size_out = dev_req->size_out;
-	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
-	       sizeof(u32) * req->count);
-
 	kfree(buffer);
 	kfree(dev_req);
 
 	return ret;
 }
 
+unsigned int rpmsg_eptdev_poll(struct file *fp, struct poll_table_struct *wait)
+{
+	struct rpmsg_apu *apu = fp->private_data;
+	unsigned long flags;
+
+	poll_wait(fp, &apu->waitqueue, wait);
+	spin_lock_irqsave(&apu->ctx_lock, flags);
+	if (apu->available_response) {
+		spin_unlock_irqrestore(&apu->ctx_lock, flags);
+		return POLLIN;
+	}
+	spin_unlock_irqrestore(&apu->ctx_lock, flags);
+	return 0;
+}
 
 static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 			       unsigned long arg)
@@ -285,6 +299,11 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 	void __user *argp = (void __user *)arg;
 	int len;
 	int ret;
+	unsigned long flags;
+	struct rpmsg_request *rpmsg_req;
+	int i;
+
+	ret = 0;
 
 	switch (cmd) {
 	case APU_SEND_REQ_IOCTL:
@@ -306,24 +325,69 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 		}
 
 		ret = apu_send_request(apu, apu_req_full);
-		if (ret) {
-			kfree(apu_req_full);
-			return ret;
+
+		break;
+	case APU_GET_NEXT_AVAILABLE_IOCTL:
+		ret = -ENOMSG;
+		spin_lock_irqsave(&apu->ctx_lock, flags);
+		list_for_each_entry(rpmsg_req, &apu->requests, node) {
+			if (rpmsg_req->ready == 1) {
+				struct apu_request *req =
+					rpmsg_req->req;
+
+				ret = 0;
+				if (copy_to_user(argp, &req->id, sizeof(__u16)))
+					ret = -EFAULT;
+				break;
+			}
 		}
+		spin_unlock_irqrestore(&apu->ctx_lock, flags);
+		break;
+	case APU_GET_RESP:
+		/* Get the header */
+		if (!argp)
+			return -EINVAL;
 
-		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
-				 sizeof(u32) * apu_req_full->count +
-				 apu_req_full->size_in + apu_req_full->size_out))
-			ret = -EFAULT;
+		if (copy_from_user(&apu_req, argp,
+				   sizeof(apu_req)))
+			return -EFAULT;
 
-		kfree(apu_req_full);
-		return ret;
+		spin_lock_irqsave(&apu->ctx_lock, flags);
+		list_for_each_entry(rpmsg_req, &apu->requests, node) {
+			struct apu_request *req = rpmsg_req->req;
+
+			if ((apu_req.id == req->id) &&
+			    (rpmsg_req->ready == 1)) {
+				int req_len = sizeof(struct apu_request) +
+					req->size_in + req->size_out +
+					req->count * sizeof(u32)*2;
+				int apu_req_len = sizeof(struct apu_request) +
+					req->size_in + req->size_out +
+					req->count * sizeof(u32)*2;
+
+				len = min(req_len, apu_req_len);
+				if (copy_to_user(argp, req, len))
+					ret = -EFAULT;
+				apu->available_response--;
+				ida_simple_remove(&req_ida, req->id);
+				for (i = 0; i < req->count ; i++)
+					apu_device_memory_unmap(apu,
+							&rpmsg_req->buffer[i]);
+				list_del(&rpmsg_req->node);
+				kfree(rpmsg_req->buffer);
+				kfree(rpmsg_req->req);
+				kfree(rpmsg_req);
+				break;
+			}
+		}
+		spin_unlock_irqrestore(&apu->ctx_lock, flags);
 
+		break;
 	default:
-		return -EINVAL;
+		ret = -EINVAL;
 	}
 
-	return 0;
+	return ret;
 }
 
 static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
@@ -351,6 +415,7 @@ static const struct file_operations rpmsg_eptdev_fops = {
 	.release = rpmsg_eptdev_release,
 	.unlocked_ioctl = rpmsg_eptdev_ioctl,
 	.compat_ioctl = rpmsg_eptdev_ioctl,
+	.poll = rpmsg_eptdev_poll,
 };
 
 static void iova_domain_release(struct kref *ref)
@@ -512,6 +577,11 @@ static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
 	dev->id = ret;
 	dev_set_name(&apu->dev, "apu%d", ret);
 
+	init_waitqueue_head(&apu->waitqueue);
+	spin_lock_init(&apu->ctx_lock);
+	apu->available_response = 0;
+	INIT_LIST_HEAD(&apu->requests);
+
 	ret = cdev_add(&apu->cdev, dev->devt, 1);
 	if (ret)
 		goto free_ctrl_ida;
diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
index 81c9e4af9a94..f61207520254 100644
--- a/include/uapi/linux/apu_rpmsg.h
+++ b/include/uapi/linux/apu_rpmsg.h
@@ -21,6 +21,7 @@
  *        by the APU.
  */
 struct apu_request {
+	__u16 id;
 	__u16 cmd;
 	__u16 result;
 	__u16 size_in;
@@ -31,6 +32,9 @@ struct apu_request {
 };
 
 /* Send synchronous request to an APU */
-#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
+
+#define APU_SEND_REQ_IOCTL		_IOW(0xb7, 0x2, struct apu_request)
+#define APU_GET_NEXT_AVAILABLE_IOCTL	_IOR(0xb7, 0x3, __u16)
+#define APU_GET_RESP			_IOWR(0xb7, 0x4, struct apu_request)
 
 #endif
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 2/4] rpmsg: apu_rpmsg: Add support for async apu request
@ 2020-09-30 11:53   ` Alexandre Bailon
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, gpain, stephane.leprovost, jstephan, linux-kernel,
	dri-devel, linaro-mm-sig, mturquette, Alexandre Bailon,
	bjorn.andersson, christian.koenig, linux-media

From: Julien STEPHAN <jstephan@baylibre.com>

In order to improve performances and flexibility,
add support of async request.

Signed-off-by: Julien STEPHAN <jstephan@baylibre.com>
Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/rpmsg/apu_rpmsg.c      | 208 ++++++++++++++++++++++-----------
 include/uapi/linux/apu_rpmsg.h |   6 +-
 2 files changed, 144 insertions(+), 70 deletions(-)

diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
index 5131b8b8e1f2..e14597c467d7 100644
--- a/drivers/rpmsg/apu_rpmsg.c
+++ b/drivers/rpmsg/apu_rpmsg.c
@@ -34,11 +34,16 @@ struct rpmsg_apu {
 	struct iommu_domain *domain;
 	struct iova_domain *iovad;
 	int iova_limit_pfn;
+	wait_queue_head_t waitqueue;
+	u8 available_response;
+	spinlock_t ctx_lock;
+	struct list_head requests;
 };
 
 struct rpmsg_request {
-	struct completion completion;
+	u8 ready;
 	struct list_head node;
+	struct apu_buffer *buffer;
 	void *req;
 };
 
@@ -68,25 +73,35 @@ static dev_t rpmsg_major;
 static DEFINE_IDA(rpmsg_ctrl_ida);
 static DEFINE_IDA(rpmsg_minor_ida);
 static DEFINE_IDA(req_ida);
-static LIST_HEAD(requests);
 static struct apu_iova_domain *apu_iovad;
 
-static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
+
+static int apu_rpmsg_callback(struct rpmsg_device *rpdev, void *data, int count,
 			      void *priv, u32 addr)
 {
+	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
 	struct rpmsg_request *rpmsg_req;
 	struct apu_dev_request *hdr = data;
+	unsigned long flags;
 
-	list_for_each_entry(rpmsg_req, &requests, node) {
-		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
+	spin_lock_irqsave(&apu->ctx_lock, flags);
+	list_for_each_entry(rpmsg_req, &apu->requests, node) {
+		struct apu_request *tmp_hdr = rpmsg_req->req;
 
 		if (hdr->id == tmp_hdr->id) {
-			memcpy(rpmsg_req->req, data, count);
-			complete(&rpmsg_req->completion);
-
-			return 0;
+			rpmsg_req->ready = 1;
+			apu->available_response++;
+			tmp_hdr->result = hdr->result;
+			tmp_hdr->size_in = hdr->size_in;
+			tmp_hdr->size_out = hdr->size_out;
+			memcpy(tmp_hdr->data, hdr->data,
+			       hdr->size_in+hdr->size_out);
+
+			wake_up_interruptible(&apu->waitqueue);
+			break;
 		}
 	}
+	spin_unlock_irqrestore(&apu->ctx_lock, flags);
 
 	return 0;
 }
@@ -177,48 +192,6 @@ static void apu_device_memory_unmap(struct rpmsg_apu *apu,
 	dma_buf_put(buffer->dma_buf);
 }
 
-static int _apu_send_request(struct rpmsg_apu *apu,
-			     struct rpmsg_device *rpdev,
-			     struct apu_dev_request *req, int len)
-{
-
-	struct rpmsg_request *rpmsg_req;
-	int ret = 0;
-
-	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
-	if (req->id < 0)
-		return ret;
-
-	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
-	if (!rpmsg_req)
-		return -ENOMEM;
-
-	rpmsg_req->req = req;
-	init_completion(&rpmsg_req->completion);
-	list_add(&rpmsg_req->node, &requests);
-
-	ret = rpmsg_send(rpdev->ept, req, len);
-	if (ret)
-		goto free_req;
-
-	/* be careful with race here between timeout and callback*/
-	ret = wait_for_completion_timeout(&rpmsg_req->completion,
-					  msecs_to_jiffies(1000));
-	if (!ret)
-		ret = -ETIMEDOUT;
-	else
-		ret = 0;
-
-	ida_simple_remove(&req_ida, req->id);
-
-free_req:
-
-	list_del(&rpmsg_req->node);
-	kfree(rpmsg_req);
-
-	return ret;
-}
-
 static int apu_send_request(struct rpmsg_apu *apu,
 			    struct apu_request *req)
 {
@@ -226,6 +199,8 @@ static int apu_send_request(struct rpmsg_apu *apu,
 	struct rpmsg_device *rpdev = apu->rpdev;
 	struct apu_dev_request *dev_req;
 	struct apu_buffer *buffer;
+	struct rpmsg_request *rpmsg_req;
+	unsigned long flags;
 
 	int size = req->size_in + req->size_out +
 		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
@@ -257,24 +232,63 @@ static int apu_send_request(struct rpmsg_apu *apu,
 		dev_req_buffer_size[i] = buffer_size[i];
 	}
 
-	ret = _apu_send_request(apu, rpdev, dev_req, size);
+	ret = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
+	if (ret < 0)
+		goto err_free_memory;
+
+	dev_req->id = ret;
+
+	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
+	if (!rpmsg_req) {
+		ret =  -ENOMEM;
+		goto err_ida_remove;
+	}
 
+	req->id = dev_req->id;
+	rpmsg_req->req = req;
+	rpmsg_req->buffer = buffer;
+	spin_lock_irqsave(&apu->ctx_lock, flags);
+	list_add(&rpmsg_req->node, &apu->requests);
+	spin_unlock_irqrestore(&apu->ctx_lock, flags);
+
+	ret = rpmsg_send(rpdev->ept, dev_req, size);
+	if (ret < 0)
+		goto err;
+
+	kfree(dev_req);
+
+	return req->id;
+
+err:
+	list_del(&rpmsg_req->node);
+	kfree(rpmsg_req);
+	kfree(req);
+err_ida_remove:
+	ida_simple_remove(&req_ida, dev_req->id);
 err_free_memory:
 	for (i--; i >= 0; i--)
 		apu_device_memory_unmap(apu, &buffer[i]);
 
-	req->result = dev_req->result;
-	req->size_in = dev_req->size_in;
-	req->size_out = dev_req->size_out;
-	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
-	       sizeof(u32) * req->count);
-
 	kfree(buffer);
 	kfree(dev_req);
 
 	return ret;
 }
 
+unsigned int rpmsg_eptdev_poll(struct file *fp, struct poll_table_struct *wait)
+{
+	struct rpmsg_apu *apu = fp->private_data;
+	unsigned long flags;
+
+	poll_wait(fp, &apu->waitqueue, wait);
+	spin_lock_irqsave(&apu->ctx_lock, flags);
+	if (apu->available_response) {
+		spin_unlock_irqrestore(&apu->ctx_lock, flags);
+		return POLLIN;
+	}
+	spin_unlock_irqrestore(&apu->ctx_lock, flags);
+	return 0;
+}
 
 static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 			       unsigned long arg)
@@ -285,6 +299,11 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 	void __user *argp = (void __user *)arg;
 	int len;
 	int ret;
+	unsigned long flags;
+	struct rpmsg_request *rpmsg_req;
+	int i;
+
+	ret = 0;
 
 	switch (cmd) {
 	case APU_SEND_REQ_IOCTL:
@@ -306,24 +325,69 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 		}
 
 		ret = apu_send_request(apu, apu_req_full);
-		if (ret) {
-			kfree(apu_req_full);
-			return ret;
+
+		break;
+	case APU_GET_NEXT_AVAILABLE_IOCTL:
+		ret = -ENOMSG;
+		spin_lock_irqsave(&apu->ctx_lock, flags);
+		list_for_each_entry(rpmsg_req, &apu->requests, node) {
+			if (rpmsg_req->ready == 1) {
+				struct apu_request *req =
+					rpmsg_req->req;
+
+				ret = 0;
+				if (copy_to_user(argp, &req->id, sizeof(__u16)))
+					ret = -EFAULT;
+				break;
+			}
 		}
+		spin_unlock_irqrestore(&apu->ctx_lock, flags);
+		break;
+	case APU_GET_RESP:
+		/* Get the header */
+		if (!argp)
+			return -EINVAL;
 
-		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
-				 sizeof(u32) * apu_req_full->count +
-				 apu_req_full->size_in + apu_req_full->size_out))
-			ret = -EFAULT;
+		if (copy_from_user(&apu_req, argp,
+				   sizeof(apu_req)))
+			return -EFAULT;
 
-		kfree(apu_req_full);
-		return ret;
+		spin_lock_irqsave(&apu->ctx_lock, flags);
+		list_for_each_entry(rpmsg_req, &apu->requests, node) {
+			struct apu_request *req = rpmsg_req->req;
+
+			if ((apu_req.id == req->id) &&
+			    (rpmsg_req->ready == 1)) {
+				int req_len = sizeof(struct apu_request) +
+					req->size_in + req->size_out +
+					req->count * sizeof(u32)*2;
+				int apu_req_len = sizeof(struct apu_request) +
+					req->size_in + req->size_out +
+					req->count * sizeof(u32)*2;
+
+				len = min(req_len, apu_req_len);
+				if (copy_to_user(argp, req, len))
+					ret = -EFAULT;
+				apu->available_response--;
+				ida_simple_remove(&req_ida, req->id);
+				for (i = 0; i < req->count ; i++)
+					apu_device_memory_unmap(apu,
+							&rpmsg_req->buffer[i]);
+				list_del(&rpmsg_req->node);
+				kfree(rpmsg_req->buffer);
+				kfree(rpmsg_req->req);
+				kfree(rpmsg_req);
+				break;
+			}
+		}
+		spin_unlock_irqrestore(&apu->ctx_lock, flags);
 
+		break;
 	default:
-		return -EINVAL;
+		ret = -EINVAL;
 	}
 
-	return 0;
+	return ret;
 }
 
 static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
@@ -351,6 +415,7 @@ static const struct file_operations rpmsg_eptdev_fops = {
 	.release = rpmsg_eptdev_release,
 	.unlocked_ioctl = rpmsg_eptdev_ioctl,
 	.compat_ioctl = rpmsg_eptdev_ioctl,
+	.poll = rpmsg_eptdev_poll,
 };
 
 static void iova_domain_release(struct kref *ref)
@@ -512,6 +577,11 @@ static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
 	dev->id = ret;
 	dev_set_name(&apu->dev, "apu%d", ret);
 
+	init_waitqueue_head(&apu->waitqueue);
+	spin_lock_init(&apu->ctx_lock);
+	apu->available_response = 0;
+	INIT_LIST_HEAD(&apu->requests);
+
 	ret = cdev_add(&apu->cdev, dev->devt, 1);
 	if (ret)
 		goto free_ctrl_ida;
diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
index 81c9e4af9a94..f61207520254 100644
--- a/include/uapi/linux/apu_rpmsg.h
+++ b/include/uapi/linux/apu_rpmsg.h
@@ -21,6 +21,7 @@
  *        by the APU.
  */
 struct apu_request {
+	__u16 id;
 	__u16 cmd;
 	__u16 result;
 	__u16 size_in;
@@ -31,6 +32,9 @@ struct apu_request {
 };
 
 /* Send synchronous request to an APU */
-#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
+
+#define APU_SEND_REQ_IOCTL		_IOW(0xb7, 0x2, struct apu_request)
+#define APU_GET_NEXT_AVAILABLE_IOCTL	_IOR(0xb7, 0x3, __u16)
+#define APU_GET_RESP			_IOWR(0xb7, 0x4, struct apu_request)
 
 #endif
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 3/4] rpmsg: apu_rpmsg: update the way to store IOMMU mapping
  2020-09-30 11:53 ` Alexandre Bailon
@ 2020-09-30 11:53   ` Alexandre Bailon
  -1 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, bjorn.andersson, sumit.semwal, christian.koenig,
	linux-kernel, linux-media, dri-devel, linaro-mm-sig, jstephan,
	stephane.leprovost, gpain, mturquette, Alexandre Bailon

In order to reduce the memory mapping operations we are going to
add an IOCTL to request a mapping.
To make easier to add this new operation, use 2 lists to store the
mappings, one for the request and one for the device.

Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/rpmsg/apu_rpmsg.c | 104 +++++++++++++++++++++++++-------------
 1 file changed, 70 insertions(+), 34 deletions(-)

diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
index e14597c467d7..343bd08a859a 100644
--- a/drivers/rpmsg/apu_rpmsg.c
+++ b/drivers/rpmsg/apu_rpmsg.c
@@ -38,12 +38,14 @@ struct rpmsg_apu {
 	u8 available_response;
 	spinlock_t ctx_lock;
 	struct list_head requests;
+
+	struct list_head buffers;
 };
 
 struct rpmsg_request {
 	u8 ready;
 	struct list_head node;
-	struct apu_buffer *buffer;
+	struct list_head buffers;
 	void *req;
 };
 
@@ -53,6 +55,11 @@ struct apu_buffer {
 	struct dma_buf_attachment *attachment;
 	struct sg_table *sg_table;
 	u32 iova;
+
+	struct rpmsg_apu *apu;
+	struct list_head node;
+	struct list_head req_node;
+	struct kref refcount;
 };
 
 /*
@@ -106,23 +113,46 @@ static int apu_rpmsg_callback(struct rpmsg_device *rpdev, void *data, int count,
 	return 0;
 }
 
-static int apu_device_memory_map(struct rpmsg_apu *apu,
-				 struct apu_buffer *buffer)
+static struct apu_buffer *apu_device_memory_map(struct rpmsg_apu *apu,
+		uint32_t fd, struct rpmsg_request *rpmsg_req)
 {
 	struct rpmsg_device *rpdev = apu->rpdev;
+	struct apu_buffer *buffer;
 	phys_addr_t phys;
 	int total_buf_space;
 	int iova_pfn;
 	int ret;
 
-	if (!buffer->fd)
-		return 0;
+	if (!fd)
+		return NULL;
+
+	list_for_each_entry(buffer, &apu->buffers, node) {
+		if (buffer->fd == fd) {
+			kref_get(&buffer->refcount);
+			if (rpmsg_req)
+				list_add(&buffer->req_node,
+					 &rpmsg_req->buffers);
+
+			return buffer;
+		}
+	}
+
+	buffer = kmalloc(sizeof(*buffer), GFP_KERNEL);
+	if (!buffer)
+		return ERR_PTR(-ENOMEM);
+
+	kref_init(&buffer->refcount);
+	buffer->fd = fd;
+	buffer->apu = apu;
+	INIT_LIST_HEAD(&buffer->req_node);
+	INIT_LIST_HEAD(&buffer->node);
 
 	buffer->dma_buf = dma_buf_get(buffer->fd);
 	if (IS_ERR(buffer->dma_buf)) {
 		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
 			PTR_ERR(buffer->dma_buf));
-		return PTR_ERR(buffer->dma_buf);
+		ret = PTR_ERR(buffer->dma_buf);
+		goto err_free_buffer;
 	}
 
 	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
@@ -158,7 +188,9 @@ static int apu_device_memory_map(struct rpmsg_apu *apu,
 		goto err_free_iova;
 	}
 
-	return 0;
+	list_add(&buffer->node, &apu->buffers);
+
+	return buffer;
 
 err_free_iova:
 	free_iova(apu->iovad, iova_pfn);
@@ -170,13 +202,17 @@ static int apu_device_memory_map(struct rpmsg_apu *apu,
 	dma_buf_detach(buffer->dma_buf, buffer->attachment);
 err_dma_buf_put:
 	dma_buf_put(buffer->dma_buf);
+err_free_buffer:
+	kfree(buffer);
 
-	return ret;
+	return ERR_PTR(ret);
 }
 
-static void apu_device_memory_unmap(struct rpmsg_apu *apu,
-				    struct apu_buffer *buffer)
+static void apu_device_memory_unmap(struct kref *ref)
 {
+	struct apu_buffer *buffer = container_of(ref, struct apu_buffer,
+						 refcount);
+	struct rpmsg_apu *apu = buffer->apu;
 	int total_buf_space;
 
 	if (!buffer->fd)
@@ -190,6 +226,8 @@ static void apu_device_memory_unmap(struct rpmsg_apu *apu,
 				 DMA_BIDIRECTIONAL);
 	dma_buf_detach(buffer->dma_buf, buffer->attachment);
 	dma_buf_put(buffer->dma_buf);
+	list_del(&buffer->node);
+	kfree(buffer);
 }
 
 static int apu_send_request(struct rpmsg_apu *apu,
@@ -198,7 +236,7 @@ static int apu_send_request(struct rpmsg_apu *apu,
 	int ret;
 	struct rpmsg_device *rpdev = apu->rpdev;
 	struct apu_dev_request *dev_req;
-	struct apu_buffer *buffer;
+	struct apu_buffer *buffer, *tmp;
 	struct rpmsg_request *rpmsg_req;
 	unsigned long flags;
 
@@ -222,14 +260,21 @@ static int apu_send_request(struct rpmsg_apu *apu,
 	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
 	memcpy(dev_req->data, req->data, req->size_in);
 
-	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
+	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
+	if (!rpmsg_req)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&rpmsg_req->buffers);
 	for (i = 0; i < req->count; i++) {
-		buffer[i].fd = fd[i];
-		ret = apu_device_memory_map(apu, &buffer[i]);
-		if (ret)
+		buffer = apu_device_memory_map(apu, fd[i], rpmsg_req);
+		if (IS_ERR(buffer)) {
+			ret = PTR_ERR(buffer);
 			goto err_free_memory;
-		dev_req_da[i] = buffer[i].iova;
+		}
+
+		dev_req_da[i] = buffer->iova;
 		dev_req_buffer_size[i] = buffer_size[i];
+		list_add(&buffer->req_node, &rpmsg_req->buffers);
 	}
 
 	ret = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
@@ -238,15 +283,8 @@ static int apu_send_request(struct rpmsg_apu *apu,
 
 	dev_req->id = ret;
 
-	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
-	if (!rpmsg_req) {
-		ret =  -ENOMEM;
-		goto err_ida_remove;
-	}
-
 	req->id = dev_req->id;
 	rpmsg_req->req = req;
-	rpmsg_req->buffer = buffer;
 	spin_lock_irqsave(&apu->ctx_lock, flags);
 	list_add(&rpmsg_req->node, &apu->requests);
 	spin_unlock_irqrestore(&apu->ctx_lock, flags);
@@ -261,15 +299,12 @@ static int apu_send_request(struct rpmsg_apu *apu,
 
 err:
 	list_del(&rpmsg_req->node);
-	kfree(rpmsg_req);
 	kfree(req);
-err_ida_remove:
 	ida_simple_remove(&req_ida, dev_req->id);
 err_free_memory:
-	for (i--; i >= 0; i--)
-		apu_device_memory_unmap(apu, &buffer[i]);
-
-	kfree(buffer);
+	list_for_each_entry_safe(buffer, tmp, &rpmsg_req->buffers, req_node)
+		kref_put(&buffer->refcount, apu_device_memory_unmap);
+	kfree(rpmsg_req);
 	kfree(dev_req);
 
 	return ret;
@@ -296,12 +331,12 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 	struct rpmsg_apu *apu = fp->private_data;
 	struct apu_request apu_req;
 	struct apu_request *apu_req_full;
+	struct apu_buffer *buffer, *tmp;
 	void __user *argp = (void __user *)arg;
 	int len;
 	int ret;
 	unsigned long flags;
 	struct rpmsg_request *rpmsg_req;
-	int i;
 
 	ret = 0;
 
@@ -370,11 +405,11 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 					ret = -EFAULT;
 				apu->available_response--;
 				ida_simple_remove(&req_ida, req->id);
-				for (i = 0; i < req->count ; i++)
-					apu_device_memory_unmap(apu,
-							&rpmsg_req->buffer[i]);
+				list_for_each_entry_safe(buffer, tmp, &rpmsg_req->buffers, req_node) {
+					kref_put(&buffer->refcount, apu_device_memory_unmap);
+					list_del(&buffer->req_node);
+				}
 				list_del(&rpmsg_req->node);
-				kfree(rpmsg_req->buffer);
 				kfree(rpmsg_req->req);
 				kfree(rpmsg_req);
 				break;
@@ -554,6 +589,7 @@ static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
 	if (!apu)
 		return -ENOMEM;
 	apu->rpdev = rpdev;
+	INIT_LIST_HEAD(&apu->buffers);
 
 	apu->rproc = apu_get_rproc(rpdev);
 	if (IS_ERR_OR_NULL(apu->rproc))
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 3/4] rpmsg: apu_rpmsg: update the way to store IOMMU mapping
@ 2020-09-30 11:53   ` Alexandre Bailon
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, gpain, stephane.leprovost, jstephan, linux-kernel,
	dri-devel, linaro-mm-sig, mturquette, Alexandre Bailon,
	bjorn.andersson, christian.koenig, linux-media

In order to reduce the memory mapping operations we are going to
add an IOCTL to request a mapping.
To make easier to add this new operation, use 2 lists to store the
mappings, one for the request and one for the device.

Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/rpmsg/apu_rpmsg.c | 104 +++++++++++++++++++++++++-------------
 1 file changed, 70 insertions(+), 34 deletions(-)

diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
index e14597c467d7..343bd08a859a 100644
--- a/drivers/rpmsg/apu_rpmsg.c
+++ b/drivers/rpmsg/apu_rpmsg.c
@@ -38,12 +38,14 @@ struct rpmsg_apu {
 	u8 available_response;
 	spinlock_t ctx_lock;
 	struct list_head requests;
+
+	struct list_head buffers;
 };
 
 struct rpmsg_request {
 	u8 ready;
 	struct list_head node;
-	struct apu_buffer *buffer;
+	struct list_head buffers;
 	void *req;
 };
 
@@ -53,6 +55,11 @@ struct apu_buffer {
 	struct dma_buf_attachment *attachment;
 	struct sg_table *sg_table;
 	u32 iova;
+
+	struct rpmsg_apu *apu;
+	struct list_head node;
+	struct list_head req_node;
+	struct kref refcount;
 };
 
 /*
@@ -106,23 +113,46 @@ static int apu_rpmsg_callback(struct rpmsg_device *rpdev, void *data, int count,
 	return 0;
 }
 
-static int apu_device_memory_map(struct rpmsg_apu *apu,
-				 struct apu_buffer *buffer)
+static struct apu_buffer *apu_device_memory_map(struct rpmsg_apu *apu,
+		uint32_t fd, struct rpmsg_request *rpmsg_req)
 {
 	struct rpmsg_device *rpdev = apu->rpdev;
+	struct apu_buffer *buffer;
 	phys_addr_t phys;
 	int total_buf_space;
 	int iova_pfn;
 	int ret;
 
-	if (!buffer->fd)
-		return 0;
+	if (!fd)
+		return NULL;
+
+	list_for_each_entry(buffer, &apu->buffers, node) {
+		if (buffer->fd == fd) {
+			kref_get(&buffer->refcount);
+			if (rpmsg_req)
+				list_add(&buffer->req_node,
+					 &rpmsg_req->buffers);
+
+			return buffer;
+		}
+	}
+
+	buffer = kmalloc(sizeof(*buffer), GFP_KERNEL);
+	if (!buffer)
+		return ERR_PTR(-ENOMEM);
+
+	kref_init(&buffer->refcount);
+	buffer->fd = fd;
+	buffer->apu = apu;
+	INIT_LIST_HEAD(&buffer->req_node);
+	INIT_LIST_HEAD(&buffer->node);
 
 	buffer->dma_buf = dma_buf_get(buffer->fd);
 	if (IS_ERR(buffer->dma_buf)) {
 		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
 			PTR_ERR(buffer->dma_buf));
-		return PTR_ERR(buffer->dma_buf);
+		ret = PTR_ERR(buffer->dma_buf);
+		goto err_free_buffer;
 	}
 
 	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
@@ -158,7 +188,9 @@ static int apu_device_memory_map(struct rpmsg_apu *apu,
 		goto err_free_iova;
 	}
 
-	return 0;
+	list_add(&buffer->node, &apu->buffers);
+
+	return buffer;
 
 err_free_iova:
 	free_iova(apu->iovad, iova_pfn);
@@ -170,13 +202,17 @@ static int apu_device_memory_map(struct rpmsg_apu *apu,
 	dma_buf_detach(buffer->dma_buf, buffer->attachment);
 err_dma_buf_put:
 	dma_buf_put(buffer->dma_buf);
+err_free_buffer:
+	kfree(buffer);
 
-	return ret;
+	return ERR_PTR(ret);
 }
 
-static void apu_device_memory_unmap(struct rpmsg_apu *apu,
-				    struct apu_buffer *buffer)
+static void apu_device_memory_unmap(struct kref *ref)
 {
+	struct apu_buffer *buffer = container_of(ref, struct apu_buffer,
+						 refcount);
+	struct rpmsg_apu *apu = buffer->apu;
 	int total_buf_space;
 
 	if (!buffer->fd)
@@ -190,6 +226,8 @@ static void apu_device_memory_unmap(struct rpmsg_apu *apu,
 				 DMA_BIDIRECTIONAL);
 	dma_buf_detach(buffer->dma_buf, buffer->attachment);
 	dma_buf_put(buffer->dma_buf);
+	list_del(&buffer->node);
+	kfree(buffer);
 }
 
 static int apu_send_request(struct rpmsg_apu *apu,
@@ -198,7 +236,7 @@ static int apu_send_request(struct rpmsg_apu *apu,
 	int ret;
 	struct rpmsg_device *rpdev = apu->rpdev;
 	struct apu_dev_request *dev_req;
-	struct apu_buffer *buffer;
+	struct apu_buffer *buffer, *tmp;
 	struct rpmsg_request *rpmsg_req;
 	unsigned long flags;
 
@@ -222,14 +260,21 @@ static int apu_send_request(struct rpmsg_apu *apu,
 	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
 	memcpy(dev_req->data, req->data, req->size_in);
 
-	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
+	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
+	if (!rpmsg_req)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&rpmsg_req->buffers);
 	for (i = 0; i < req->count; i++) {
-		buffer[i].fd = fd[i];
-		ret = apu_device_memory_map(apu, &buffer[i]);
-		if (ret)
+		buffer = apu_device_memory_map(apu, fd[i], rpmsg_req);
+		if (IS_ERR(buffer)) {
+			ret = PTR_ERR(buffer);
 			goto err_free_memory;
-		dev_req_da[i] = buffer[i].iova;
+		}
+
+		dev_req_da[i] = buffer->iova;
 		dev_req_buffer_size[i] = buffer_size[i];
+		list_add(&buffer->req_node, &rpmsg_req->buffers);
 	}
 
 	ret = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
@@ -238,15 +283,8 @@ static int apu_send_request(struct rpmsg_apu *apu,
 
 	dev_req->id = ret;
 
-	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
-	if (!rpmsg_req) {
-		ret =  -ENOMEM;
-		goto err_ida_remove;
-	}
-
 	req->id = dev_req->id;
 	rpmsg_req->req = req;
-	rpmsg_req->buffer = buffer;
 	spin_lock_irqsave(&apu->ctx_lock, flags);
 	list_add(&rpmsg_req->node, &apu->requests);
 	spin_unlock_irqrestore(&apu->ctx_lock, flags);
@@ -261,15 +299,12 @@ static int apu_send_request(struct rpmsg_apu *apu,
 
 err:
 	list_del(&rpmsg_req->node);
-	kfree(rpmsg_req);
 	kfree(req);
-err_ida_remove:
 	ida_simple_remove(&req_ida, dev_req->id);
 err_free_memory:
-	for (i--; i >= 0; i--)
-		apu_device_memory_unmap(apu, &buffer[i]);
-
-	kfree(buffer);
+	list_for_each_entry_safe(buffer, tmp, &rpmsg_req->buffers, req_node)
+		kref_put(&buffer->refcount, apu_device_memory_unmap);
+	kfree(rpmsg_req);
 	kfree(dev_req);
 
 	return ret;
@@ -296,12 +331,12 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 	struct rpmsg_apu *apu = fp->private_data;
 	struct apu_request apu_req;
 	struct apu_request *apu_req_full;
+	struct apu_buffer *buffer, *tmp;
 	void __user *argp = (void __user *)arg;
 	int len;
 	int ret;
 	unsigned long flags;
 	struct rpmsg_request *rpmsg_req;
-	int i;
 
 	ret = 0;
 
@@ -370,11 +405,11 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 					ret = -EFAULT;
 				apu->available_response--;
 				ida_simple_remove(&req_ida, req->id);
-				for (i = 0; i < req->count ; i++)
-					apu_device_memory_unmap(apu,
-							&rpmsg_req->buffer[i]);
+				list_for_each_entry_safe(buffer, tmp, &rpmsg_req->buffers, req_node) {
+					kref_put(&buffer->refcount, apu_device_memory_unmap);
+					list_del(&buffer->req_node);
+				}
 				list_del(&rpmsg_req->node);
-				kfree(rpmsg_req->buffer);
 				kfree(rpmsg_req->req);
 				kfree(rpmsg_req);
 				break;
@@ -554,6 +589,7 @@ static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
 	if (!apu)
 		return -ENOMEM;
 	apu->rpdev = rpdev;
+	INIT_LIST_HEAD(&apu->buffers);
 
 	apu->rproc = apu_get_rproc(rpdev);
 	if (IS_ERR_OR_NULL(apu->rproc))
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 4/4] rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
  2020-09-30 11:53 ` Alexandre Bailon
@ 2020-09-30 11:53   ` Alexandre Bailon
  -1 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, bjorn.andersson, sumit.semwal, christian.koenig,
	linux-kernel, linux-media, dri-devel, linaro-mm-sig, jstephan,
	stephane.leprovost, gpain, mturquette, Alexandre Bailon

Currently, the kernel is automatically doing an IOMMU memory mapping.
But we want to do it automatically for two reasons:
- to reduce the overhead of each APU operation
- to get the device address and use it as input for an operation
This adds 2 IOCTL to manually IOMMU map and unmap memory.

Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/rpmsg/apu_rpmsg.c      | 52 ++++++++++++++++++++++++++++++----
 include/uapi/linux/apu_rpmsg.h |  7 +++++
 2 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
index 343bd08a859a..4c064feddf5a 100644
--- a/drivers/rpmsg/apu_rpmsg.c
+++ b/drivers/rpmsg/apu_rpmsg.c
@@ -114,7 +114,7 @@ static int apu_rpmsg_callback(struct rpmsg_device *rpdev, void *data, int count,
 }
 
 static struct apu_buffer *apu_device_memory_map(struct rpmsg_apu *apu,
-		uint32_t fd, struct rpmsg_request *rpmsg_req)
+						uint32_t fd)
 {
 	struct rpmsg_device *rpdev = apu->rpdev;
 	struct apu_buffer *buffer;
@@ -129,10 +129,6 @@ static struct apu_buffer *apu_device_memory_map(struct rpmsg_apu *apu,
 	list_for_each_entry(buffer, &apu->buffers, node) {
 		if (buffer->fd == fd) {
 			kref_get(&buffer->refcount);
-			if (rpmsg_req)
-				list_add(&buffer->req_node,
-					 &rpmsg_req->buffers);
-
 			return buffer;
 		}
 	}
@@ -230,6 +226,44 @@ static void apu_device_memory_unmap(struct kref *ref)
 	kfree(buffer);
 }
 
+static int apu_iommu_mmap_ioctl(struct rpmsg_apu *apu, void __user *argp)
+{
+	struct apu_iommu_mmap apu_iommu_mmap;
+	struct apu_buffer *buffer;
+	int ret;
+
+	if (copy_from_user(&apu_iommu_mmap, argp, sizeof(apu_iommu_mmap)))
+		return -EFAULT;
+
+	buffer = apu_device_memory_map(apu, apu_iommu_mmap.fd);
+	if (!buffer)
+		return -ENOMEM;
+
+	apu_iommu_mmap.da = buffer->iova;
+	if (copy_to_user(argp, &apu_iommu_mmap, sizeof(apu_iommu_mmap)))
+		ret = -EFAULT;
+
+	return 0;
+}
+
+static int apu_iommu_munmap_ioctl(struct rpmsg_apu *apu, void __user *argp)
+{
+	u32 fd;
+	struct apu_buffer *buffer, *tmp;
+
+	if (copy_from_user(&fd, argp, sizeof(fd)))
+		return -EFAULT;
+
+	list_for_each_entry_safe(buffer, tmp, &apu->buffers, node) {
+		if (buffer->fd == fd) {
+			kref_put(&buffer->refcount, apu_device_memory_unmap);
+			return 0;
+		}
+	}
+
+	return -EINVAL;
+}
+
 static int apu_send_request(struct rpmsg_apu *apu,
 			    struct apu_request *req)
 {
@@ -266,7 +300,7 @@ static int apu_send_request(struct rpmsg_apu *apu,
 
 	INIT_LIST_HEAD(&rpmsg_req->buffers);
 	for (i = 0; i < req->count; i++) {
-		buffer = apu_device_memory_map(apu, fd[i], rpmsg_req);
+		buffer = apu_device_memory_map(apu, fd[i]);
 		if (IS_ERR(buffer)) {
 			ret = PTR_ERR(buffer);
 			goto err_free_memory;
@@ -417,6 +451,12 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 		}
 		spin_unlock_irqrestore(&apu->ctx_lock, flags);
 
+		break;
+	case APU_IOMMU_MMAP:
+		ret = apu_iommu_mmap_ioctl(apu, argp);
+		break;
+	case APU_IOMMU_MUNMAP:
+		ret = apu_iommu_munmap_ioctl(apu, argp);
 		break;
 	default:
 		ret = -EINVAL;
diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
index f61207520254..e9b841dcbcb4 100644
--- a/include/uapi/linux/apu_rpmsg.h
+++ b/include/uapi/linux/apu_rpmsg.h
@@ -31,10 +31,17 @@ struct apu_request {
 	__u8 data[0];
 };
 
+struct apu_iommu_mmap {
+	__u32 fd;
+	__u32 da;
+};
+
 /* Send synchronous request to an APU */
 
 #define APU_SEND_REQ_IOCTL		_IOW(0xb7, 0x2, struct apu_request)
 #define APU_GET_NEXT_AVAILABLE_IOCTL	_IOR(0xb7, 0x3, __u16)
 #define APU_GET_RESP			_IOWR(0xb7, 0x4, struct apu_request)
+#define APU_IOMMU_MMAP			_IOWR(0xb7, 0x5, struct apu_iommu_mmap)
+#define APU_IOMMU_MUNMAP		_IOWR(0xb7, 0x6, __u32)
 
 #endif
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 4/4] rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
@ 2020-09-30 11:53   ` Alexandre Bailon
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-09-30 11:53 UTC (permalink / raw)
  To: linux-remoteproc
  Cc: ohad, gpain, stephane.leprovost, jstephan, linux-kernel,
	dri-devel, linaro-mm-sig, mturquette, Alexandre Bailon,
	bjorn.andersson, christian.koenig, linux-media

Currently, the kernel is automatically doing an IOMMU memory mapping.
But we want to do it automatically for two reasons:
- to reduce the overhead of each APU operation
- to get the device address and use it as input for an operation
This adds 2 IOCTL to manually IOMMU map and unmap memory.

Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/rpmsg/apu_rpmsg.c      | 52 ++++++++++++++++++++++++++++++----
 include/uapi/linux/apu_rpmsg.h |  7 +++++
 2 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
index 343bd08a859a..4c064feddf5a 100644
--- a/drivers/rpmsg/apu_rpmsg.c
+++ b/drivers/rpmsg/apu_rpmsg.c
@@ -114,7 +114,7 @@ static int apu_rpmsg_callback(struct rpmsg_device *rpdev, void *data, int count,
 }
 
 static struct apu_buffer *apu_device_memory_map(struct rpmsg_apu *apu,
-		uint32_t fd, struct rpmsg_request *rpmsg_req)
+						uint32_t fd)
 {
 	struct rpmsg_device *rpdev = apu->rpdev;
 	struct apu_buffer *buffer;
@@ -129,10 +129,6 @@ static struct apu_buffer *apu_device_memory_map(struct rpmsg_apu *apu,
 	list_for_each_entry(buffer, &apu->buffers, node) {
 		if (buffer->fd == fd) {
 			kref_get(&buffer->refcount);
-			if (rpmsg_req)
-				list_add(&buffer->req_node,
-					 &rpmsg_req->buffers);
-
 			return buffer;
 		}
 	}
@@ -230,6 +226,44 @@ static void apu_device_memory_unmap(struct kref *ref)
 	kfree(buffer);
 }
 
+static int apu_iommu_mmap_ioctl(struct rpmsg_apu *apu, void __user *argp)
+{
+	struct apu_iommu_mmap apu_iommu_mmap;
+	struct apu_buffer *buffer;
+	int ret;
+
+	if (copy_from_user(&apu_iommu_mmap, argp, sizeof(apu_iommu_mmap)))
+		return -EFAULT;
+
+	buffer = apu_device_memory_map(apu, apu_iommu_mmap.fd);
+	if (!buffer)
+		return -ENOMEM;
+
+	apu_iommu_mmap.da = buffer->iova;
+	if (copy_to_user(argp, &apu_iommu_mmap, sizeof(apu_iommu_mmap)))
+		ret = -EFAULT;
+
+	return 0;
+}
+
+static int apu_iommu_munmap_ioctl(struct rpmsg_apu *apu, void __user *argp)
+{
+	u32 fd;
+	struct apu_buffer *buffer, *tmp;
+
+	if (copy_from_user(&fd, argp, sizeof(fd)))
+		return -EFAULT;
+
+	list_for_each_entry_safe(buffer, tmp, &apu->buffers, node) {
+		if (buffer->fd == fd) {
+			kref_put(&buffer->refcount, apu_device_memory_unmap);
+			return 0;
+		}
+	}
+
+	return -EINVAL;
+}
+
 static int apu_send_request(struct rpmsg_apu *apu,
 			    struct apu_request *req)
 {
@@ -266,7 +300,7 @@ static int apu_send_request(struct rpmsg_apu *apu,
 
 	INIT_LIST_HEAD(&rpmsg_req->buffers);
 	for (i = 0; i < req->count; i++) {
-		buffer = apu_device_memory_map(apu, fd[i], rpmsg_req);
+		buffer = apu_device_memory_map(apu, fd[i]);
 		if (IS_ERR(buffer)) {
 			ret = PTR_ERR(buffer);
 			goto err_free_memory;
@@ -417,6 +451,12 @@ static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
 		}
 		spin_unlock_irqrestore(&apu->ctx_lock, flags);
 
+		break;
+	case APU_IOMMU_MMAP:
+		ret = apu_iommu_mmap_ioctl(apu, argp);
+		break;
+	case APU_IOMMU_MUNMAP:
+		ret = apu_iommu_munmap_ioctl(apu, argp);
 		break;
 	default:
 		ret = -EINVAL;
diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
index f61207520254..e9b841dcbcb4 100644
--- a/include/uapi/linux/apu_rpmsg.h
+++ b/include/uapi/linux/apu_rpmsg.h
@@ -31,10 +31,17 @@ struct apu_request {
 	__u8 data[0];
 };
 
+struct apu_iommu_mmap {
+	__u32 fd;
+	__u32 da;
+};
+
 /* Send synchronous request to an APU */
 
 #define APU_SEND_REQ_IOCTL		_IOW(0xb7, 0x2, struct apu_request)
 #define APU_GET_NEXT_AVAILABLE_IOCTL	_IOR(0xb7, 0x3, __u16)
 #define APU_GET_RESP			_IOWR(0xb7, 0x4, struct apu_request)
+#define APU_IOMMU_MMAP			_IOWR(0xb7, 0x5, struct apu_iommu_mmap)
+#define APU_IOMMU_MUNMAP		_IOWR(0xb7, 0x6, __u32)
 
 #endif
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU)
  2020-09-30 11:53 ` Alexandre Bailon
@ 2020-10-01  8:48   ` Daniel Vetter
  -1 siblings, 0 replies; 22+ messages in thread
From: Daniel Vetter @ 2020-10-01  8:48 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: linux-remoteproc, ohad, gpain, stephane.leprovost, jstephan,
	linux-kernel, dri-devel, linaro-mm-sig, mturquette,
	bjorn.andersson, christian.koenig, linux-media

On Wed, Sep 30, 2020 at 01:53:46PM +0200, Alexandre Bailon wrote:
> This adds a RPMsg driver that implements communication between the CPU and an
> APU.
> This uses VirtIO buffer to exchange messages but for sharing data, this uses
> a dmabuf, mapped to be shared between CPU (userspace) and APU.
> The driver is relatively generic, and should work with any SoC implementing
> hardware accelerator for AI if they use support remoteproc and VirtIO.
> 
> For the people interested by the firmware or userspace library,
> the sources are available here:
> https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu

Since this has open userspace (from a very cursory look), and smells very
much like an acceleration driver, and seems to use dma-buf for memory
management: Why is this not just a drm driver?
-Daniel

> 
> Alexandre Bailon (3):
>   Add a RPMSG driver for the APU in the mt8183
>   rpmsg: apu_rpmsg: update the way to store IOMMU mapping
>   rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
> 
> Julien STEPHAN (1):
>   rpmsg: apu_rpmsg: Add support for async apu request
> 
>  drivers/rpmsg/Kconfig          |   9 +
>  drivers/rpmsg/Makefile         |   1 +
>  drivers/rpmsg/apu_rpmsg.c      | 752 +++++++++++++++++++++++++++++++++
>  drivers/rpmsg/apu_rpmsg.h      |  52 +++
>  include/uapi/linux/apu_rpmsg.h |  47 +++
>  5 files changed, 861 insertions(+)
>  create mode 100644 drivers/rpmsg/apu_rpmsg.c
>  create mode 100644 drivers/rpmsg/apu_rpmsg.h
>  create mode 100644 include/uapi/linux/apu_rpmsg.h
> 
> -- 
> 2.26.2
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU)
@ 2020-10-01  8:48   ` Daniel Vetter
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel Vetter @ 2020-10-01  8:48 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: ohad, linaro-mm-sig, stephane.leprovost, christian.koenig,
	mturquette, linux-remoteproc, linux-kernel, dri-devel,
	bjorn.andersson, gpain, jstephan, linux-media

On Wed, Sep 30, 2020 at 01:53:46PM +0200, Alexandre Bailon wrote:
> This adds a RPMsg driver that implements communication between the CPU and an
> APU.
> This uses VirtIO buffer to exchange messages but for sharing data, this uses
> a dmabuf, mapped to be shared between CPU (userspace) and APU.
> The driver is relatively generic, and should work with any SoC implementing
> hardware accelerator for AI if they use support remoteproc and VirtIO.
> 
> For the people interested by the firmware or userspace library,
> the sources are available here:
> https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu

Since this has open userspace (from a very cursory look), and smells very
much like an acceleration driver, and seems to use dma-buf for memory
management: Why is this not just a drm driver?
-Daniel

> 
> Alexandre Bailon (3):
>   Add a RPMSG driver for the APU in the mt8183
>   rpmsg: apu_rpmsg: update the way to store IOMMU mapping
>   rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
> 
> Julien STEPHAN (1):
>   rpmsg: apu_rpmsg: Add support for async apu request
> 
>  drivers/rpmsg/Kconfig          |   9 +
>  drivers/rpmsg/Makefile         |   1 +
>  drivers/rpmsg/apu_rpmsg.c      | 752 +++++++++++++++++++++++++++++++++
>  drivers/rpmsg/apu_rpmsg.h      |  52 +++
>  include/uapi/linux/apu_rpmsg.h |  47 +++
>  5 files changed, 861 insertions(+)
>  create mode 100644 drivers/rpmsg/apu_rpmsg.c
>  create mode 100644 drivers/rpmsg/apu_rpmsg.h
>  create mode 100644 include/uapi/linux/apu_rpmsg.h
> 
> -- 
> 2.26.2
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU)
  2020-10-01  8:48   ` Daniel Vetter
@ 2020-10-01 17:28     ` Alexandre Bailon
  -1 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-10-01 17:28 UTC (permalink / raw)
  To: linux-remoteproc, ohad, gpain, stephane.leprovost, jstephan,
	linux-kernel, dri-devel, linaro-mm-sig, mturquette,
	bjorn.andersson, christian.koenig, linux-media

Hi Daniel,

On 10/1/20 10:48 AM, Daniel Vetter wrote:
> On Wed, Sep 30, 2020 at 01:53:46PM +0200, Alexandre Bailon wrote:
>> This adds a RPMsg driver that implements communication between the CPU and an
>> APU.
>> This uses VirtIO buffer to exchange messages but for sharing data, this uses
>> a dmabuf, mapped to be shared between CPU (userspace) and APU.
>> The driver is relatively generic, and should work with any SoC implementing
>> hardware accelerator for AI if they use support remoteproc and VirtIO.
>>
>> For the people interested by the firmware or userspace library,
>> the sources are available here:
>> https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu
> Since this has open userspace (from a very cursory look), and smells very
> much like an acceleration driver, and seems to use dma-buf for memory
> management: Why is this not just a drm driver?

I have never though to DRM since for me it was only a RPMsg driver.
I don't know well DRM. Could you tell me how you would do it so I could 
have a look ?

Thanks,
Alexandre

> -Daniel
>
>> Alexandre Bailon (3):
>>    Add a RPMSG driver for the APU in the mt8183
>>    rpmsg: apu_rpmsg: update the way to store IOMMU mapping
>>    rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
>>
>> Julien STEPHAN (1):
>>    rpmsg: apu_rpmsg: Add support for async apu request
>>
>>   drivers/rpmsg/Kconfig          |   9 +
>>   drivers/rpmsg/Makefile         |   1 +
>>   drivers/rpmsg/apu_rpmsg.c      | 752 +++++++++++++++++++++++++++++++++
>>   drivers/rpmsg/apu_rpmsg.h      |  52 +++
>>   include/uapi/linux/apu_rpmsg.h |  47 +++
>>   5 files changed, 861 insertions(+)
>>   create mode 100644 drivers/rpmsg/apu_rpmsg.c
>>   create mode 100644 drivers/rpmsg/apu_rpmsg.h
>>   create mode 100644 include/uapi/linux/apu_rpmsg.h
>>
>> -- 
>> 2.26.2
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU)
@ 2020-10-01 17:28     ` Alexandre Bailon
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2020-10-01 17:28 UTC (permalink / raw)
  To: linux-remoteproc, ohad, gpain, stephane.leprovost, jstephan,
	linux-kernel, dri-devel, linaro-mm-sig, mturquette,
	bjorn.andersson, christian.koenig, linux-media

Hi Daniel,

On 10/1/20 10:48 AM, Daniel Vetter wrote:
> On Wed, Sep 30, 2020 at 01:53:46PM +0200, Alexandre Bailon wrote:
>> This adds a RPMsg driver that implements communication between the CPU and an
>> APU.
>> This uses VirtIO buffer to exchange messages but for sharing data, this uses
>> a dmabuf, mapped to be shared between CPU (userspace) and APU.
>> The driver is relatively generic, and should work with any SoC implementing
>> hardware accelerator for AI if they use support remoteproc and VirtIO.
>>
>> For the people interested by the firmware or userspace library,
>> the sources are available here:
>> https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu
> Since this has open userspace (from a very cursory look), and smells very
> much like an acceleration driver, and seems to use dma-buf for memory
> management: Why is this not just a drm driver?

I have never though to DRM since for me it was only a RPMsg driver.
I don't know well DRM. Could you tell me how you would do it so I could 
have a look ?

Thanks,
Alexandre

> -Daniel
>
>> Alexandre Bailon (3):
>>    Add a RPMSG driver for the APU in the mt8183
>>    rpmsg: apu_rpmsg: update the way to store IOMMU mapping
>>    rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
>>
>> Julien STEPHAN (1):
>>    rpmsg: apu_rpmsg: Add support for async apu request
>>
>>   drivers/rpmsg/Kconfig          |   9 +
>>   drivers/rpmsg/Makefile         |   1 +
>>   drivers/rpmsg/apu_rpmsg.c      | 752 +++++++++++++++++++++++++++++++++
>>   drivers/rpmsg/apu_rpmsg.h      |  52 +++
>>   include/uapi/linux/apu_rpmsg.h |  47 +++
>>   5 files changed, 861 insertions(+)
>>   create mode 100644 drivers/rpmsg/apu_rpmsg.c
>>   create mode 100644 drivers/rpmsg/apu_rpmsg.h
>>   create mode 100644 include/uapi/linux/apu_rpmsg.h
>>
>> -- 
>> 2.26.2
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU)
  2020-10-01 17:28     ` Alexandre Bailon
@ 2020-10-02  9:35       ` Daniel Vetter
  -1 siblings, 0 replies; 22+ messages in thread
From: Daniel Vetter @ 2020-10-02  9:35 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: linux-remoteproc, ohad, gpain, stephane.leprovost, jstephan,
	linux-kernel, dri-devel, linaro-mm-sig, mturquette,
	bjorn.andersson, christian.koenig, linux-media

On Thu, Oct 01, 2020 at 07:28:27PM +0200, Alexandre Bailon wrote:
> Hi Daniel,
> 
> On 10/1/20 10:48 AM, Daniel Vetter wrote:
> > On Wed, Sep 30, 2020 at 01:53:46PM +0200, Alexandre Bailon wrote:
> > > This adds a RPMsg driver that implements communication between the CPU and an
> > > APU.
> > > This uses VirtIO buffer to exchange messages but for sharing data, this uses
> > > a dmabuf, mapped to be shared between CPU (userspace) and APU.
> > > The driver is relatively generic, and should work with any SoC implementing
> > > hardware accelerator for AI if they use support remoteproc and VirtIO.
> > > 
> > > For the people interested by the firmware or userspace library,
> > > the sources are available here:
> > > https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu
> > Since this has open userspace (from a very cursory look), and smells very
> > much like an acceleration driver, and seems to use dma-buf for memory
> > management: Why is this not just a drm driver?
> 
> I have never though to DRM since for me it was only a RPMsg driver.
> I don't know well DRM. Could you tell me how you would do it so I could have
> a look ?

Well internally it would still be an rpmsg driver ... I'm assuming that's
kinda similar to how most gpu drivers sit on top of a pci_device or a
platform_device, it's just a means to get at your "device"?

The part I'm talking about here is the userspace api. You're creating an
entirely new chardev interface, which at least from a quick look seems to
be based on dma-buf buffers and used to submit commands to your device to
do some kind of computing/processing. That's exactly what drivers/gpu/drm
does (if you ignore the display/modeset side of things) - at the kernel
level gpus have nothing to do with graphics, but all with handling buffer
objects and throwing workloads at some kind of accelerator thing.

Of course that's just my guess of what's going on, after scrolling through
your driver and userspace a bit, I might be completely off. But if my
guess is roughly right, then your driver is internally an rpmsg
driver, but towards userspace it should be a drm driver.

Cheers, Daniel

> 
> Thanks,
> Alexandre
> 
> > -Daniel
> > 
> > > Alexandre Bailon (3):
> > >    Add a RPMSG driver for the APU in the mt8183
> > >    rpmsg: apu_rpmsg: update the way to store IOMMU mapping
> > >    rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
> > > 
> > > Julien STEPHAN (1):
> > >    rpmsg: apu_rpmsg: Add support for async apu request
> > > 
> > >   drivers/rpmsg/Kconfig          |   9 +
> > >   drivers/rpmsg/Makefile         |   1 +
> > >   drivers/rpmsg/apu_rpmsg.c      | 752 +++++++++++++++++++++++++++++++++
> > >   drivers/rpmsg/apu_rpmsg.h      |  52 +++
> > >   include/uapi/linux/apu_rpmsg.h |  47 +++
> > >   5 files changed, 861 insertions(+)
> > >   create mode 100644 drivers/rpmsg/apu_rpmsg.c
> > >   create mode 100644 drivers/rpmsg/apu_rpmsg.h
> > >   create mode 100644 include/uapi/linux/apu_rpmsg.h
> > > 
> > > -- 
> > > 2.26.2
> > > 
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU)
@ 2020-10-02  9:35       ` Daniel Vetter
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel Vetter @ 2020-10-02  9:35 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: ohad, linaro-mm-sig, stephane.leprovost, christian.koenig,
	mturquette, linux-remoteproc, linux-kernel, dri-devel,
	bjorn.andersson, gpain, jstephan, linux-media

On Thu, Oct 01, 2020 at 07:28:27PM +0200, Alexandre Bailon wrote:
> Hi Daniel,
> 
> On 10/1/20 10:48 AM, Daniel Vetter wrote:
> > On Wed, Sep 30, 2020 at 01:53:46PM +0200, Alexandre Bailon wrote:
> > > This adds a RPMsg driver that implements communication between the CPU and an
> > > APU.
> > > This uses VirtIO buffer to exchange messages but for sharing data, this uses
> > > a dmabuf, mapped to be shared between CPU (userspace) and APU.
> > > The driver is relatively generic, and should work with any SoC implementing
> > > hardware accelerator for AI if they use support remoteproc and VirtIO.
> > > 
> > > For the people interested by the firmware or userspace library,
> > > the sources are available here:
> > > https://github.com/BayLibre/open-amp/tree/v2020.01-mtk/apps/examples/apu
> > Since this has open userspace (from a very cursory look), and smells very
> > much like an acceleration driver, and seems to use dma-buf for memory
> > management: Why is this not just a drm driver?
> 
> I have never though to DRM since for me it was only a RPMsg driver.
> I don't know well DRM. Could you tell me how you would do it so I could have
> a look ?

Well internally it would still be an rpmsg driver ... I'm assuming that's
kinda similar to how most gpu drivers sit on top of a pci_device or a
platform_device, it's just a means to get at your "device"?

The part I'm talking about here is the userspace api. You're creating an
entirely new chardev interface, which at least from a quick look seems to
be based on dma-buf buffers and used to submit commands to your device to
do some kind of computing/processing. That's exactly what drivers/gpu/drm
does (if you ignore the display/modeset side of things) - at the kernel
level gpus have nothing to do with graphics, but all with handling buffer
objects and throwing workloads at some kind of accelerator thing.

Of course that's just my guess of what's going on, after scrolling through
your driver and userspace a bit, I might be completely off. But if my
guess is roughly right, then your driver is internally an rpmsg
driver, but towards userspace it should be a drm driver.

Cheers, Daniel

> 
> Thanks,
> Alexandre
> 
> > -Daniel
> > 
> > > Alexandre Bailon (3):
> > >    Add a RPMSG driver for the APU in the mt8183
> > >    rpmsg: apu_rpmsg: update the way to store IOMMU mapping
> > >    rpmsg: apu_rpmsg: Add an IOCTL to request IOMMU mapping
> > > 
> > > Julien STEPHAN (1):
> > >    rpmsg: apu_rpmsg: Add support for async apu request
> > > 
> > >   drivers/rpmsg/Kconfig          |   9 +
> > >   drivers/rpmsg/Makefile         |   1 +
> > >   drivers/rpmsg/apu_rpmsg.c      | 752 +++++++++++++++++++++++++++++++++
> > >   drivers/rpmsg/apu_rpmsg.h      |  52 +++
> > >   include/uapi/linux/apu_rpmsg.h |  47 +++
> > >   5 files changed, 861 insertions(+)
> > >   create mode 100644 drivers/rpmsg/apu_rpmsg.c
> > >   create mode 100644 drivers/rpmsg/apu_rpmsg.h
> > >   create mode 100644 include/uapi/linux/apu_rpmsg.h
> > > 
> > > -- 
> > > 2.26.2
> > > 
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183
  2020-09-30 11:53   ` Alexandre Bailon
@ 2020-10-14 22:55     ` Mathieu Poirier
  -1 siblings, 0 replies; 22+ messages in thread
From: Mathieu Poirier @ 2020-10-14 22:55 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: linux-remoteproc, ohad, bjorn.andersson, sumit.semwal,
	christian.koenig, linux-kernel, linux-media, dri-devel,
	linaro-mm-sig, jstephan, stephane.leprovost, gpain, mturquette

Hi Alexandre,

On Wed, Sep 30, 2020 at 01:53:47PM +0200, Alexandre Bailon wrote:
> This adds a driver to communicate with the APU available
> in the mt8183. The driver is generic and could be used for other APU.
> It mostly provides a userspace interface to send messages and
> and share big buffers with the APU.
> 
> Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
> ---
>  drivers/rpmsg/Kconfig          |   9 +
>  drivers/rpmsg/Makefile         |   1 +
>  drivers/rpmsg/apu_rpmsg.c      | 606 +++++++++++++++++++++++++++++++++
>  drivers/rpmsg/apu_rpmsg.h      |  52 +++
>  include/uapi/linux/apu_rpmsg.h |  36 ++
>  5 files changed, 704 insertions(+)
>  create mode 100644 drivers/rpmsg/apu_rpmsg.c
>  create mode 100644 drivers/rpmsg/apu_rpmsg.h
>  create mode 100644 include/uapi/linux/apu_rpmsg.h
> 
> diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
> index f96716893c2a..3437c6fc8647 100644
> --- a/drivers/rpmsg/Kconfig
> +++ b/drivers/rpmsg/Kconfig
> @@ -64,4 +64,13 @@ config RPMSG_VIRTIO
>  	select RPMSG
>  	select VIRTIO
>  
> +config RPMSG_APU
> +	tristate "APU RPMSG driver"
> +	help
> +	  This provides a RPMSG driver that provides some facilities to
> +	  communicate with an accelerated processing unit (APU).
> +	  This creates one or more char files that could be used by userspace
> +	  to send a message to an APU. In addition, this also take care of
> +	  sharing the memory buffer with the APU.
> +
>  endmenu
> diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
> index ffe932ef6050..93e0f3de99c9 100644
> --- a/drivers/rpmsg/Makefile
> +++ b/drivers/rpmsg/Makefile
> @@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
>  obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
>  obj-$(CONFIG_RPMSG_QCOM_SMD)	+= qcom_smd.o
>  obj-$(CONFIG_RPMSG_VIRTIO)	+= virtio_rpmsg_bus.o
> +obj-$(CONFIG_RPMSG_APU)		+= apu_rpmsg.o
> diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
> new file mode 100644
> index 000000000000..5131b8b8e1f2
> --- /dev/null
> +++ b/drivers/rpmsg/apu_rpmsg.c
> @@ -0,0 +1,606 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// Copyright 2020 BayLibre SAS
> +
> +#include <linux/cdev.h>
> +#include <linux/dma-buf.h>
> +#include <linux/iommu.h>
> +#include <linux/iova.h>
> +#include <linux/types.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/remoteproc.h>
> +#include <linux/rpmsg.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include "rpmsg_internal.h"
> +
> +#include <uapi/linux/apu_rpmsg.h>
> +
> +#include "apu_rpmsg.h"
> +
> +/* Maximum of APU devices supported */
> +#define APU_DEV_MAX 2
> +
> +#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
> +#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
> +
> +struct rpmsg_apu {
> +	struct rpmsg_device *rpdev;
> +	struct cdev cdev;
> +	struct device dev;
> +
> +	struct rproc *rproc;
> +	struct iommu_domain *domain;
> +	struct iova_domain *iovad;
> +	int iova_limit_pfn;
> +};
> +
> +struct rpmsg_request {
> +	struct completion completion;
> +	struct list_head node;
> +	void *req;
> +};
> +
> +struct apu_buffer {
> +	int fd;
> +	struct dma_buf *dma_buf;
> +	struct dma_buf_attachment *attachment;
> +	struct sg_table *sg_table;
> +	u32 iova;
> +};
> +
> +/*
> + * Shared IOVA domain.
> + * The MT8183 has two VP6 core but they are sharing the IOVA.
> + * They could be used alone, or together. In order to avoid conflict,
> + * create an IOVA domain that could be shared by those two core.
> + * @iovad: The IOVA domain to share between the APU cores
> + * @refcount: Allow to automatically release the IOVA domain once all the APU
> + *            cores has been stopped
> + */
> +struct apu_iova_domain {
> +	struct iova_domain iovad;
> +	struct kref refcount;
> +};
> +
> +static dev_t rpmsg_major;
> +static DEFINE_IDA(rpmsg_ctrl_ida);
> +static DEFINE_IDA(rpmsg_minor_ida);
> +static DEFINE_IDA(req_ida);
> +static LIST_HEAD(requests);
> +static struct apu_iova_domain *apu_iovad;
> +
> +static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
> +			      void *priv, u32 addr)
> +{
> +	struct rpmsg_request *rpmsg_req;
> +	struct apu_dev_request *hdr = data;
> +
> +	list_for_each_entry(rpmsg_req, &requests, node) {
> +		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
> +
> +		if (hdr->id == tmp_hdr->id) {
> +			memcpy(rpmsg_req->req, data, count);
> +			complete(&rpmsg_req->completion);
> +
> +			return 0;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int apu_device_memory_map(struct rpmsg_apu *apu,
> +				 struct apu_buffer *buffer)
> +{
> +	struct rpmsg_device *rpdev = apu->rpdev;
> +	phys_addr_t phys;
> +	int total_buf_space;
> +	int iova_pfn;
> +	int ret;
> +
> +	if (!buffer->fd)
> +		return 0;
> +
> +	buffer->dma_buf = dma_buf_get(buffer->fd);
> +	if (IS_ERR(buffer->dma_buf)) {
> +		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
> +			PTR_ERR(buffer->dma_buf));
> +		return PTR_ERR(buffer->dma_buf);
> +	}
> +
> +	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
> +	if (IS_ERR(buffer->attachment)) {
> +		dev_err(&rpdev->dev, "Failed to attach dma_buf\n");
> +		ret = PTR_ERR(buffer->attachment);
> +		goto err_dma_buf_put;
> +	}
> +
> +	buffer->sg_table = dma_buf_map_attachment(buffer->attachment,
> +						   DMA_BIDIRECTIONAL);
> +	if (IS_ERR(buffer->sg_table)) {
> +		dev_err(&rpdev->dev, "Failed to map attachment\n");
> +		ret = PTR_ERR(buffer->sg_table);
> +		goto err_dma_buf_detach;
> +	}
> +	phys = page_to_phys(sg_page(buffer->sg_table->sgl));
> +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
> +
> +	iova_pfn = alloc_iova_fast(apu->iovad, total_buf_space >> PAGE_SHIFT,
> +				   apu->iova_limit_pfn, true);
> +	if (!iova_pfn) {
> +		dev_err(&rpdev->dev, "Failed to allocate iova address\n");
> +		ret = -ENOMEM;
> +		goto err_dma_unmap_attachment;
> +	}
> +
> +	buffer->iova = PFN_PHYS(iova_pfn);
> +	ret = iommu_map(apu->rproc->domain, buffer->iova, phys, total_buf_space,
> +			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
> +	if (ret) {
> +		dev_err(&rpdev->dev, "Failed to iommu map\n");
> +		goto err_free_iova;
> +	}
> +
> +	return 0;
> +
> +err_free_iova:
> +	free_iova(apu->iovad, iova_pfn);
> +err_dma_unmap_attachment:
> +	dma_buf_unmap_attachment(buffer->attachment,
> +				 buffer->sg_table,
> +				 DMA_BIDIRECTIONAL);
> +err_dma_buf_detach:
> +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
> +err_dma_buf_put:
> +	dma_buf_put(buffer->dma_buf);
> +
> +	return ret;
> +}
> +
> +static void apu_device_memory_unmap(struct rpmsg_apu *apu,
> +				    struct apu_buffer *buffer)
> +{
> +	int total_buf_space;
> +
> +	if (!buffer->fd)
> +		return;
> +
> +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
> +	iommu_unmap(apu->rproc->domain, buffer->iova, total_buf_space);
> +	free_iova(apu->iovad, PHYS_PFN(buffer->iova));
> +	dma_buf_unmap_attachment(buffer->attachment,
> +				 buffer->sg_table,
> +				 DMA_BIDIRECTIONAL);
> +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
> +	dma_buf_put(buffer->dma_buf);
> +}
> +
> +static int _apu_send_request(struct rpmsg_apu *apu,
> +			     struct rpmsg_device *rpdev,
> +			     struct apu_dev_request *req, int len)
> +{
> +
> +	struct rpmsg_request *rpmsg_req;
> +	int ret = 0;
> +
> +	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
> +	if (req->id < 0)
> +		return ret;
> +
> +	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
> +	if (!rpmsg_req)
> +		return -ENOMEM;
> +
> +	rpmsg_req->req = req;
> +	init_completion(&rpmsg_req->completion);
> +	list_add(&rpmsg_req->node, &requests);
> +
> +	ret = rpmsg_send(rpdev->ept, req, len);
> +	if (ret)
> +		goto free_req;
> +
> +	/* be careful with race here between timeout and callback*/
> +	ret = wait_for_completion_timeout(&rpmsg_req->completion,
> +					  msecs_to_jiffies(1000));
> +	if (!ret)
> +		ret = -ETIMEDOUT;
> +	else
> +		ret = 0;
> +
> +	ida_simple_remove(&req_ida, req->id);
> +
> +free_req:
> +
> +	list_del(&rpmsg_req->node);
> +	kfree(rpmsg_req);
> +
> +	return ret;
> +}
> +
> +static int apu_send_request(struct rpmsg_apu *apu,
> +			    struct apu_request *req)
> +{
> +	int ret;
> +	struct rpmsg_device *rpdev = apu->rpdev;
> +	struct apu_dev_request *dev_req;
> +	struct apu_buffer *buffer;
> +
> +	int size = req->size_in + req->size_out +
> +		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
> +	u32 *fd = (u32 *)(req->data + req->size_in + req->size_out);
> +	u32 *buffer_size = (u32 *)(fd + req->count);
> +	u32 *dev_req_da;
> +	u32 *dev_req_buffer_size;
> +	int i;
> +
> +	dev_req = kmalloc(size, GFP_KERNEL);
> +	if (!dev_req)
> +		return -ENOMEM;
> +
> +	dev_req->cmd = req->cmd;
> +	dev_req->size_in = req->size_in;
> +	dev_req->size_out = req->size_out;
> +	dev_req->count = req->count;
> +	dev_req_da = (u32 *)(dev_req->data + req->size_in + req->size_out);

I have started to review this set but it will take me more time to wrap my head
around what you are doing (the overall lack of comments in the code doesn't
help).

In the mean time the "dev_req->data" above is very puzzling to me - did you mean
to write "req-data"?  Otherwise I don't know how this can work since
dev_req->data is not initalised after the kmalloc(). 

More comments will come tomorrow.

Thanks,
Mathieu

> +	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
> +	memcpy(dev_req->data, req->data, req->size_in);
> +
> +	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
> +	for (i = 0; i < req->count; i++) {
> +		buffer[i].fd = fd[i];
> +		ret = apu_device_memory_map(apu, &buffer[i]);
> +		if (ret)
> +			goto err_free_memory;
> +		dev_req_da[i] = buffer[i].iova;
> +		dev_req_buffer_size[i] = buffer_size[i];
> +	}
> +
> +	ret = _apu_send_request(apu, rpdev, dev_req, size);
> +
> +err_free_memory:
> +	for (i--; i >= 0; i--)
> +		apu_device_memory_unmap(apu, &buffer[i]);
> +
> +	req->result = dev_req->result;
> +	req->size_in = dev_req->size_in;
> +	req->size_out = dev_req->size_out;
> +	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
> +	       sizeof(u32) * req->count);
> +
> +	kfree(buffer);
> +	kfree(dev_req);
> +
> +	return ret;
> +}
> +
> +
> +static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
> +			       unsigned long arg)
> +{
> +	struct rpmsg_apu *apu = fp->private_data;
> +	struct apu_request apu_req;
> +	struct apu_request *apu_req_full;
> +	void __user *argp = (void __user *)arg;
> +	int len;
> +	int ret;
> +
> +	switch (cmd) {
> +	case APU_SEND_REQ_IOCTL:
> +		/* Get the header */
> +		if (copy_from_user(&apu_req, argp,
> +				   sizeof(apu_req)))
> +			return -EFAULT;
> +
> +		len = sizeof(*apu_req_full) + apu_req.size_in +
> +			apu_req.size_out + apu_req.count * sizeof(u32) * 2;
> +		apu_req_full = kzalloc(len, GFP_KERNEL);
> +		if (!apu_req_full)
> +			return -ENOMEM;
> +
> +		/* Get the whole request */
> +		if (copy_from_user(apu_req_full, argp, len)) {
> +			kfree(apu_req_full);
> +			return -EFAULT;
> +		}
> +
> +		ret = apu_send_request(apu, apu_req_full);
> +		if (ret) {
> +			kfree(apu_req_full);
> +			return ret;
> +		}
> +
> +		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
> +				 sizeof(u32) * apu_req_full->count +
> +				 apu_req_full->size_in + apu_req_full->size_out))
> +			ret = -EFAULT;
> +
> +		kfree(apu_req_full);
> +		return ret;
> +
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
> +{
> +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
> +
> +	get_device(&apu->dev);
> +	filp->private_data = apu;
> +
> +	return 0;
> +}
> +
> +static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
> +{
> +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
> +
> +	put_device(&apu->dev);
> +
> +	return 0;
> +}
> +
> +static const struct file_operations rpmsg_eptdev_fops = {
> +	.owner = THIS_MODULE,
> +	.open = rpmsg_eptdev_open,
> +	.release = rpmsg_eptdev_release,
> +	.unlocked_ioctl = rpmsg_eptdev_ioctl,
> +	.compat_ioctl = rpmsg_eptdev_ioctl,
> +};
> +
> +static void iova_domain_release(struct kref *ref)
> +{
> +	put_iova_domain(&apu_iovad->iovad);
> +	kfree(apu_iovad);
> +	apu_iovad = NULL;
> +}
> +
> +static struct fw_rsc_iova *apu_find_rcs_iova(struct rpmsg_apu *apu)
> +{
> +	struct rproc *rproc = apu->rproc;
> +	struct resource_table *table;
> +	struct fw_rsc_iova *rsc;
> +	int i;
> +
> +	table = rproc->table_ptr;
> +	for (i = 0; i < table->num; i++) {
> +		int offset = table->offset[i];
> +		struct fw_rsc_hdr *hdr = (void *)table + offset;
> +
> +		switch (hdr->type) {
> +		case RSC_VENDOR_IOVA:
> +			rsc = (void *)hdr + sizeof(*hdr);
> +				return rsc;
> +			break;
> +		default:
> +			continue;
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +static int apu_reserve_iova(struct rpmsg_apu *apu, struct iova_domain *iovad)
> +{
> +	struct rproc *rproc = apu->rproc;
> +	struct resource_table *table;
> +	struct fw_rsc_carveout *rsc;
> +	int i;
> +
> +	table = rproc->table_ptr;
> +	for (i = 0; i < table->num; i++) {
> +		int offset = table->offset[i];
> +		struct fw_rsc_hdr *hdr = (void *)table + offset;
> +
> +		if (hdr->type == RSC_CARVEOUT) {
> +			struct iova *iova;
> +
> +			rsc = (void *)hdr + sizeof(*hdr);
> +			iova = reserve_iova(iovad, PHYS_PFN(rsc->da),
> +					    PHYS_PFN(rsc->da + rsc->len));
> +			if (!iova) {
> +				dev_err(&apu->dev, "failed to reserve iova\n");
> +				return -ENOMEM;
> +			}
> +			dev_dbg(&apu->dev, "Reserve: %x - %x\n",
> +				rsc->da, rsc->da + rsc->len);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int apu_init_iovad(struct rpmsg_apu *apu)
> +{
> +	struct fw_rsc_iova *rsc;
> +
> +	if (!apu->rproc->table_ptr) {
> +		dev_err(&apu->dev,
> +			"No resource_table: has the firmware been loaded ?\n");
> +		return -ENODEV;
> +	}
> +
> +	rsc = apu_find_rcs_iova(apu);
> +	if (!rsc) {
> +		dev_err(&apu->dev, "No iova range defined in resource_table\n");
> +		return -ENOMEM;
> +	}
> +
> +	if (!apu_iovad) {
> +		apu_iovad = kzalloc(sizeof(*apu_iovad), GFP_KERNEL);
> +		if (!apu_iovad)
> +			return -ENOMEM;
> +
> +		init_iova_domain(&apu_iovad->iovad, PAGE_SIZE,
> +				 PHYS_PFN(rsc->da));
> +		apu_reserve_iova(apu, &apu_iovad->iovad);
> +		kref_init(&apu_iovad->refcount);
> +	} else
> +		kref_get(&apu_iovad->refcount);
> +
> +	apu->iovad = &apu_iovad->iovad;
> +	apu->iova_limit_pfn = PHYS_PFN(rsc->da + rsc->len) - 1;
> +
> +	return 0;
> +}
> +
> +static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
> +{
> +	/*
> +	 * To work, the APU RPMsg driver need to get the rproc device.
> +	 * Currently, we only use virtio so we could use that to find the
> +	 * remoteproc parent.
> +	 */
> +	if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
> +		dev_err(&rpdev->dev, "invalid rpmsg device\n");
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
> +		dev_err(&rpdev->dev, "unsupported bus\n");
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
> +}
> +
> +static void rpmsg_apu_release_device(struct device *dev)
> +{
> +	struct rpmsg_apu *apu = dev_to_apu(dev);
> +
> +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> +	cdev_del(&apu->cdev);
> +	kfree(apu);
> +}
> +
> +static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
> +{
> +	struct rpmsg_apu *apu;
> +	struct device *dev;
> +	int ret;
> +
> +	apu = devm_kzalloc(&rpdev->dev, sizeof(*apu), GFP_KERNEL);
> +	if (!apu)
> +		return -ENOMEM;
> +	apu->rpdev = rpdev;
> +
> +	apu->rproc = apu_get_rproc(rpdev);
> +	if (IS_ERR_OR_NULL(apu->rproc))
> +		return PTR_ERR(apu->rproc);
> +
> +	dev = &apu->dev;
> +	device_initialize(dev);
> +	dev->parent = &rpdev->dev;
> +
> +	cdev_init(&apu->cdev, &rpmsg_eptdev_fops);
> +	apu->cdev.owner = THIS_MODULE;
> +
> +	ret = ida_simple_get(&rpmsg_minor_ida, 0, APU_DEV_MAX, GFP_KERNEL);
> +	if (ret < 0)
> +		goto free_apu;
> +	dev->devt = MKDEV(MAJOR(rpmsg_major), ret);
> +
> +	ret = ida_simple_get(&rpmsg_ctrl_ida, 0, 0, GFP_KERNEL);
> +	if (ret < 0)
> +		goto free_minor_ida;
> +	dev->id = ret;
> +	dev_set_name(&apu->dev, "apu%d", ret);
> +
> +	ret = cdev_add(&apu->cdev, dev->devt, 1);
> +	if (ret)
> +		goto free_ctrl_ida;
> +
> +	/* We can now rely on the release function for cleanup */
> +	dev->release = rpmsg_apu_release_device;
> +
> +	ret = device_add(dev);
> +	if (ret) {
> +		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
> +		put_device(dev);
> +	}
> +
> +	/* Make device dma capable by inheriting from parent's capabilities */
> +	set_dma_ops(&rpdev->dev, get_dma_ops(apu->rproc->dev.parent));
> +
> +	ret = dma_coerce_mask_and_coherent(&rpdev->dev,
> +					   dma_get_mask(apu->rproc->dev.parent));
> +	if (ret)
> +		goto err_put_device;
> +
> +	rpdev->dev.iommu_group = apu->rproc->dev.parent->iommu_group;
> +
> +	ret = apu_init_iovad(apu);
> +
> +	dev_set_drvdata(&rpdev->dev, apu);
> +
> +	return ret;
> +
> +err_put_device:
> +	put_device(dev);
> +free_ctrl_ida:
> +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> +free_minor_ida:
> +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> +free_apu:
> +	put_device(dev);
> +	kfree(apu);
> +
> +	return ret;
> +}
> +
> +static void apu_rpmsg_remove(struct rpmsg_device *rpdev)
> +{
> +	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
> +
> +	if (apu_iovad)
> +		kref_put(&apu_iovad->refcount, iova_domain_release);
> +
> +	device_del(&apu->dev);
> +	put_device(&apu->dev);
> +	kfree(apu);
> +}
> +
> +static const struct rpmsg_device_id apu_rpmsg_match[] = {
> +	{ APU_RPMSG_SERVICE_MT8183 },
> +	{}
> +};
> +
> +static struct rpmsg_driver apu_rpmsg_driver = {
> +	.probe = apu_rpmsg_probe,
> +	.remove = apu_rpmsg_remove,
> +	.callback = apu_rpmsg_callback,
> +	.id_table = apu_rpmsg_match,
> +	.drv  = {
> +		.name  = "apu_rpmsg",
> +	},
> +};
> +
> +static int __init apu_rpmsg_init(void)
> +{
> +	int ret;
> +
> +	ret = alloc_chrdev_region(&rpmsg_major, 0, APU_DEV_MAX, "apu");
> +	if (ret < 0) {
> +		pr_err("apu: failed to allocate char dev region\n");
> +		return ret;
> +	}
> +
> +	return register_rpmsg_driver(&apu_rpmsg_driver);
> +}
> +arch_initcall(apu_rpmsg_init);
> +
> +static void __exit apu_rpmsg_exit(void)
> +{
> +	unregister_rpmsg_driver(&apu_rpmsg_driver);
> +}
> +module_exit(apu_rpmsg_exit);
> +
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("APU RPMSG driver");
> diff --git a/drivers/rpmsg/apu_rpmsg.h b/drivers/rpmsg/apu_rpmsg.h
> new file mode 100644
> index 000000000000..54b5b7880750
> --- /dev/null
> +++ b/drivers/rpmsg/apu_rpmsg.h
> @@ -0,0 +1,52 @@
> +/* SPDX-License-Identifier: GPL-2.0
> + *
> + * Copyright 2020 BayLibre SAS
> + */
> +
> +#ifndef __APU_RPMSG_H__
> +#define __APU_RPMSG_H__
> +
> +/*
> + * Firmware request, must be aligned with the one defined in firmware.
> + * @id: Request id, used in the case of reply, to find the pending request
> + * @cmd: The command id to execute in the firmware
> + * @result: The result of the command executed on the firmware
> + * @size: The size of the data available in this request
> + * @count: The number of shared buffer
> + * @data: Contains the data attached with the request if size is greater than
> + *        zero, and the addresses of shared buffers if count is greater than
> + *        zero. Both the data and the shared buffer could be read and write
> + *        by the APU.
> + */
> +struct  apu_dev_request {
> +	u16 id;
> +	u16 cmd;
> +	u16 result;
> +	u16 size_in;
> +	u16 size_out;
> +	u16 count;
> +	u8 data[0];
> +} __packed;
> +
> +#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
> +#define APU_CTRL_SRC 1
> +#define APU_CTRL_DST 1
> +
> +/* Vendor specific resource table entry */
> +#define RSC_VENDOR_IOVA 128
> +
> +/*
> + * Firmware IOVA resource table entry
> + * Define a range of virtual device address that could mapped using the IOMMU.
> + * @da: Start virtual device address
> + * @len: Length of the virtual device address
> + * @name: name of the resource
> + */
> +struct fw_rsc_iova {
> +	u32 da;
> +	u32 len;
> +	u32 reserved;
> +	u8 name[32];
> +} __packed;
> +
> +#endif /* __APU_RPMSG_H__ */
> diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
> new file mode 100644
> index 000000000000..81c9e4af9a94
> --- /dev/null
> +++ b/include/uapi/linux/apu_rpmsg.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2020 BayLibre
> + */
> +
> +#ifndef _UAPI_RPMSG_APU_H_
> +#define _UAPI_RPMSG_APU_H_
> +
> +#include <linux/ioctl.h>
> +#include <linux/types.h>
> +
> +/*
> + * Structure containing the APU request from userspace application
> + * @cmd: The id of the command to execute on the APU
> + * @result: The result of the command executed on the APU
> + * @size: The size of the data available in this request
> + * @count: The number of shared buffer
> + * @data: Contains the data attached with the request if size is greater than
> + *        zero, and the files descriptors of shared buffers if count is greater
> + *        than zero. Both the data and the shared buffer could be read and write
> + *        by the APU.
> + */
> +struct apu_request {
> +	__u16 cmd;
> +	__u16 result;
> +	__u16 size_in;
> +	__u16 size_out;
> +	__u16 count;
> +	__u16 reserved;
> +	__u8 data[0];
> +};
> +
> +/* Send synchronous request to an APU */
> +#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
> +
> +#endif
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183
@ 2020-10-14 22:55     ` Mathieu Poirier
  0 siblings, 0 replies; 22+ messages in thread
From: Mathieu Poirier @ 2020-10-14 22:55 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: ohad, gpain, stephane.leprovost, jstephan, linux-remoteproc,
	linux-kernel, dri-devel, linaro-mm-sig, mturquette,
	bjorn.andersson, christian.koenig, linux-media

Hi Alexandre,

On Wed, Sep 30, 2020 at 01:53:47PM +0200, Alexandre Bailon wrote:
> This adds a driver to communicate with the APU available
> in the mt8183. The driver is generic and could be used for other APU.
> It mostly provides a userspace interface to send messages and
> and share big buffers with the APU.
> 
> Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
> ---
>  drivers/rpmsg/Kconfig          |   9 +
>  drivers/rpmsg/Makefile         |   1 +
>  drivers/rpmsg/apu_rpmsg.c      | 606 +++++++++++++++++++++++++++++++++
>  drivers/rpmsg/apu_rpmsg.h      |  52 +++
>  include/uapi/linux/apu_rpmsg.h |  36 ++
>  5 files changed, 704 insertions(+)
>  create mode 100644 drivers/rpmsg/apu_rpmsg.c
>  create mode 100644 drivers/rpmsg/apu_rpmsg.h
>  create mode 100644 include/uapi/linux/apu_rpmsg.h
> 
> diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
> index f96716893c2a..3437c6fc8647 100644
> --- a/drivers/rpmsg/Kconfig
> +++ b/drivers/rpmsg/Kconfig
> @@ -64,4 +64,13 @@ config RPMSG_VIRTIO
>  	select RPMSG
>  	select VIRTIO
>  
> +config RPMSG_APU
> +	tristate "APU RPMSG driver"
> +	help
> +	  This provides a RPMSG driver that provides some facilities to
> +	  communicate with an accelerated processing unit (APU).
> +	  This creates one or more char files that could be used by userspace
> +	  to send a message to an APU. In addition, this also take care of
> +	  sharing the memory buffer with the APU.
> +
>  endmenu
> diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
> index ffe932ef6050..93e0f3de99c9 100644
> --- a/drivers/rpmsg/Makefile
> +++ b/drivers/rpmsg/Makefile
> @@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
>  obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
>  obj-$(CONFIG_RPMSG_QCOM_SMD)	+= qcom_smd.o
>  obj-$(CONFIG_RPMSG_VIRTIO)	+= virtio_rpmsg_bus.o
> +obj-$(CONFIG_RPMSG_APU)		+= apu_rpmsg.o
> diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
> new file mode 100644
> index 000000000000..5131b8b8e1f2
> --- /dev/null
> +++ b/drivers/rpmsg/apu_rpmsg.c
> @@ -0,0 +1,606 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// Copyright 2020 BayLibre SAS
> +
> +#include <linux/cdev.h>
> +#include <linux/dma-buf.h>
> +#include <linux/iommu.h>
> +#include <linux/iova.h>
> +#include <linux/types.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/remoteproc.h>
> +#include <linux/rpmsg.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include "rpmsg_internal.h"
> +
> +#include <uapi/linux/apu_rpmsg.h>
> +
> +#include "apu_rpmsg.h"
> +
> +/* Maximum of APU devices supported */
> +#define APU_DEV_MAX 2
> +
> +#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
> +#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
> +
> +struct rpmsg_apu {
> +	struct rpmsg_device *rpdev;
> +	struct cdev cdev;
> +	struct device dev;
> +
> +	struct rproc *rproc;
> +	struct iommu_domain *domain;
> +	struct iova_domain *iovad;
> +	int iova_limit_pfn;
> +};
> +
> +struct rpmsg_request {
> +	struct completion completion;
> +	struct list_head node;
> +	void *req;
> +};
> +
> +struct apu_buffer {
> +	int fd;
> +	struct dma_buf *dma_buf;
> +	struct dma_buf_attachment *attachment;
> +	struct sg_table *sg_table;
> +	u32 iova;
> +};
> +
> +/*
> + * Shared IOVA domain.
> + * The MT8183 has two VP6 core but they are sharing the IOVA.
> + * They could be used alone, or together. In order to avoid conflict,
> + * create an IOVA domain that could be shared by those two core.
> + * @iovad: The IOVA domain to share between the APU cores
> + * @refcount: Allow to automatically release the IOVA domain once all the APU
> + *            cores has been stopped
> + */
> +struct apu_iova_domain {
> +	struct iova_domain iovad;
> +	struct kref refcount;
> +};
> +
> +static dev_t rpmsg_major;
> +static DEFINE_IDA(rpmsg_ctrl_ida);
> +static DEFINE_IDA(rpmsg_minor_ida);
> +static DEFINE_IDA(req_ida);
> +static LIST_HEAD(requests);
> +static struct apu_iova_domain *apu_iovad;
> +
> +static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
> +			      void *priv, u32 addr)
> +{
> +	struct rpmsg_request *rpmsg_req;
> +	struct apu_dev_request *hdr = data;
> +
> +	list_for_each_entry(rpmsg_req, &requests, node) {
> +		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
> +
> +		if (hdr->id == tmp_hdr->id) {
> +			memcpy(rpmsg_req->req, data, count);
> +			complete(&rpmsg_req->completion);
> +
> +			return 0;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int apu_device_memory_map(struct rpmsg_apu *apu,
> +				 struct apu_buffer *buffer)
> +{
> +	struct rpmsg_device *rpdev = apu->rpdev;
> +	phys_addr_t phys;
> +	int total_buf_space;
> +	int iova_pfn;
> +	int ret;
> +
> +	if (!buffer->fd)
> +		return 0;
> +
> +	buffer->dma_buf = dma_buf_get(buffer->fd);
> +	if (IS_ERR(buffer->dma_buf)) {
> +		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
> +			PTR_ERR(buffer->dma_buf));
> +		return PTR_ERR(buffer->dma_buf);
> +	}
> +
> +	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
> +	if (IS_ERR(buffer->attachment)) {
> +		dev_err(&rpdev->dev, "Failed to attach dma_buf\n");
> +		ret = PTR_ERR(buffer->attachment);
> +		goto err_dma_buf_put;
> +	}
> +
> +	buffer->sg_table = dma_buf_map_attachment(buffer->attachment,
> +						   DMA_BIDIRECTIONAL);
> +	if (IS_ERR(buffer->sg_table)) {
> +		dev_err(&rpdev->dev, "Failed to map attachment\n");
> +		ret = PTR_ERR(buffer->sg_table);
> +		goto err_dma_buf_detach;
> +	}
> +	phys = page_to_phys(sg_page(buffer->sg_table->sgl));
> +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
> +
> +	iova_pfn = alloc_iova_fast(apu->iovad, total_buf_space >> PAGE_SHIFT,
> +				   apu->iova_limit_pfn, true);
> +	if (!iova_pfn) {
> +		dev_err(&rpdev->dev, "Failed to allocate iova address\n");
> +		ret = -ENOMEM;
> +		goto err_dma_unmap_attachment;
> +	}
> +
> +	buffer->iova = PFN_PHYS(iova_pfn);
> +	ret = iommu_map(apu->rproc->domain, buffer->iova, phys, total_buf_space,
> +			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
> +	if (ret) {
> +		dev_err(&rpdev->dev, "Failed to iommu map\n");
> +		goto err_free_iova;
> +	}
> +
> +	return 0;
> +
> +err_free_iova:
> +	free_iova(apu->iovad, iova_pfn);
> +err_dma_unmap_attachment:
> +	dma_buf_unmap_attachment(buffer->attachment,
> +				 buffer->sg_table,
> +				 DMA_BIDIRECTIONAL);
> +err_dma_buf_detach:
> +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
> +err_dma_buf_put:
> +	dma_buf_put(buffer->dma_buf);
> +
> +	return ret;
> +}
> +
> +static void apu_device_memory_unmap(struct rpmsg_apu *apu,
> +				    struct apu_buffer *buffer)
> +{
> +	int total_buf_space;
> +
> +	if (!buffer->fd)
> +		return;
> +
> +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
> +	iommu_unmap(apu->rproc->domain, buffer->iova, total_buf_space);
> +	free_iova(apu->iovad, PHYS_PFN(buffer->iova));
> +	dma_buf_unmap_attachment(buffer->attachment,
> +				 buffer->sg_table,
> +				 DMA_BIDIRECTIONAL);
> +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
> +	dma_buf_put(buffer->dma_buf);
> +}
> +
> +static int _apu_send_request(struct rpmsg_apu *apu,
> +			     struct rpmsg_device *rpdev,
> +			     struct apu_dev_request *req, int len)
> +{
> +
> +	struct rpmsg_request *rpmsg_req;
> +	int ret = 0;
> +
> +	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
> +	if (req->id < 0)
> +		return ret;
> +
> +	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
> +	if (!rpmsg_req)
> +		return -ENOMEM;
> +
> +	rpmsg_req->req = req;
> +	init_completion(&rpmsg_req->completion);
> +	list_add(&rpmsg_req->node, &requests);
> +
> +	ret = rpmsg_send(rpdev->ept, req, len);
> +	if (ret)
> +		goto free_req;
> +
> +	/* be careful with race here between timeout and callback*/
> +	ret = wait_for_completion_timeout(&rpmsg_req->completion,
> +					  msecs_to_jiffies(1000));
> +	if (!ret)
> +		ret = -ETIMEDOUT;
> +	else
> +		ret = 0;
> +
> +	ida_simple_remove(&req_ida, req->id);
> +
> +free_req:
> +
> +	list_del(&rpmsg_req->node);
> +	kfree(rpmsg_req);
> +
> +	return ret;
> +}
> +
> +static int apu_send_request(struct rpmsg_apu *apu,
> +			    struct apu_request *req)
> +{
> +	int ret;
> +	struct rpmsg_device *rpdev = apu->rpdev;
> +	struct apu_dev_request *dev_req;
> +	struct apu_buffer *buffer;
> +
> +	int size = req->size_in + req->size_out +
> +		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
> +	u32 *fd = (u32 *)(req->data + req->size_in + req->size_out);
> +	u32 *buffer_size = (u32 *)(fd + req->count);
> +	u32 *dev_req_da;
> +	u32 *dev_req_buffer_size;
> +	int i;
> +
> +	dev_req = kmalloc(size, GFP_KERNEL);
> +	if (!dev_req)
> +		return -ENOMEM;
> +
> +	dev_req->cmd = req->cmd;
> +	dev_req->size_in = req->size_in;
> +	dev_req->size_out = req->size_out;
> +	dev_req->count = req->count;
> +	dev_req_da = (u32 *)(dev_req->data + req->size_in + req->size_out);

I have started to review this set but it will take me more time to wrap my head
around what you are doing (the overall lack of comments in the code doesn't
help).

In the mean time the "dev_req->data" above is very puzzling to me - did you mean
to write "req-data"?  Otherwise I don't know how this can work since
dev_req->data is not initalised after the kmalloc(). 

More comments will come tomorrow.

Thanks,
Mathieu

> +	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
> +	memcpy(dev_req->data, req->data, req->size_in);
> +
> +	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
> +	for (i = 0; i < req->count; i++) {
> +		buffer[i].fd = fd[i];
> +		ret = apu_device_memory_map(apu, &buffer[i]);
> +		if (ret)
> +			goto err_free_memory;
> +		dev_req_da[i] = buffer[i].iova;
> +		dev_req_buffer_size[i] = buffer_size[i];
> +	}
> +
> +	ret = _apu_send_request(apu, rpdev, dev_req, size);
> +
> +err_free_memory:
> +	for (i--; i >= 0; i--)
> +		apu_device_memory_unmap(apu, &buffer[i]);
> +
> +	req->result = dev_req->result;
> +	req->size_in = dev_req->size_in;
> +	req->size_out = dev_req->size_out;
> +	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
> +	       sizeof(u32) * req->count);
> +
> +	kfree(buffer);
> +	kfree(dev_req);
> +
> +	return ret;
> +}
> +
> +
> +static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
> +			       unsigned long arg)
> +{
> +	struct rpmsg_apu *apu = fp->private_data;
> +	struct apu_request apu_req;
> +	struct apu_request *apu_req_full;
> +	void __user *argp = (void __user *)arg;
> +	int len;
> +	int ret;
> +
> +	switch (cmd) {
> +	case APU_SEND_REQ_IOCTL:
> +		/* Get the header */
> +		if (copy_from_user(&apu_req, argp,
> +				   sizeof(apu_req)))
> +			return -EFAULT;
> +
> +		len = sizeof(*apu_req_full) + apu_req.size_in +
> +			apu_req.size_out + apu_req.count * sizeof(u32) * 2;
> +		apu_req_full = kzalloc(len, GFP_KERNEL);
> +		if (!apu_req_full)
> +			return -ENOMEM;
> +
> +		/* Get the whole request */
> +		if (copy_from_user(apu_req_full, argp, len)) {
> +			kfree(apu_req_full);
> +			return -EFAULT;
> +		}
> +
> +		ret = apu_send_request(apu, apu_req_full);
> +		if (ret) {
> +			kfree(apu_req_full);
> +			return ret;
> +		}
> +
> +		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
> +				 sizeof(u32) * apu_req_full->count +
> +				 apu_req_full->size_in + apu_req_full->size_out))
> +			ret = -EFAULT;
> +
> +		kfree(apu_req_full);
> +		return ret;
> +
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
> +{
> +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
> +
> +	get_device(&apu->dev);
> +	filp->private_data = apu;
> +
> +	return 0;
> +}
> +
> +static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
> +{
> +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
> +
> +	put_device(&apu->dev);
> +
> +	return 0;
> +}
> +
> +static const struct file_operations rpmsg_eptdev_fops = {
> +	.owner = THIS_MODULE,
> +	.open = rpmsg_eptdev_open,
> +	.release = rpmsg_eptdev_release,
> +	.unlocked_ioctl = rpmsg_eptdev_ioctl,
> +	.compat_ioctl = rpmsg_eptdev_ioctl,
> +};
> +
> +static void iova_domain_release(struct kref *ref)
> +{
> +	put_iova_domain(&apu_iovad->iovad);
> +	kfree(apu_iovad);
> +	apu_iovad = NULL;
> +}
> +
> +static struct fw_rsc_iova *apu_find_rcs_iova(struct rpmsg_apu *apu)
> +{
> +	struct rproc *rproc = apu->rproc;
> +	struct resource_table *table;
> +	struct fw_rsc_iova *rsc;
> +	int i;
> +
> +	table = rproc->table_ptr;
> +	for (i = 0; i < table->num; i++) {
> +		int offset = table->offset[i];
> +		struct fw_rsc_hdr *hdr = (void *)table + offset;
> +
> +		switch (hdr->type) {
> +		case RSC_VENDOR_IOVA:
> +			rsc = (void *)hdr + sizeof(*hdr);
> +				return rsc;
> +			break;
> +		default:
> +			continue;
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +static int apu_reserve_iova(struct rpmsg_apu *apu, struct iova_domain *iovad)
> +{
> +	struct rproc *rproc = apu->rproc;
> +	struct resource_table *table;
> +	struct fw_rsc_carveout *rsc;
> +	int i;
> +
> +	table = rproc->table_ptr;
> +	for (i = 0; i < table->num; i++) {
> +		int offset = table->offset[i];
> +		struct fw_rsc_hdr *hdr = (void *)table + offset;
> +
> +		if (hdr->type == RSC_CARVEOUT) {
> +			struct iova *iova;
> +
> +			rsc = (void *)hdr + sizeof(*hdr);
> +			iova = reserve_iova(iovad, PHYS_PFN(rsc->da),
> +					    PHYS_PFN(rsc->da + rsc->len));
> +			if (!iova) {
> +				dev_err(&apu->dev, "failed to reserve iova\n");
> +				return -ENOMEM;
> +			}
> +			dev_dbg(&apu->dev, "Reserve: %x - %x\n",
> +				rsc->da, rsc->da + rsc->len);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int apu_init_iovad(struct rpmsg_apu *apu)
> +{
> +	struct fw_rsc_iova *rsc;
> +
> +	if (!apu->rproc->table_ptr) {
> +		dev_err(&apu->dev,
> +			"No resource_table: has the firmware been loaded ?\n");
> +		return -ENODEV;
> +	}
> +
> +	rsc = apu_find_rcs_iova(apu);
> +	if (!rsc) {
> +		dev_err(&apu->dev, "No iova range defined in resource_table\n");
> +		return -ENOMEM;
> +	}
> +
> +	if (!apu_iovad) {
> +		apu_iovad = kzalloc(sizeof(*apu_iovad), GFP_KERNEL);
> +		if (!apu_iovad)
> +			return -ENOMEM;
> +
> +		init_iova_domain(&apu_iovad->iovad, PAGE_SIZE,
> +				 PHYS_PFN(rsc->da));
> +		apu_reserve_iova(apu, &apu_iovad->iovad);
> +		kref_init(&apu_iovad->refcount);
> +	} else
> +		kref_get(&apu_iovad->refcount);
> +
> +	apu->iovad = &apu_iovad->iovad;
> +	apu->iova_limit_pfn = PHYS_PFN(rsc->da + rsc->len) - 1;
> +
> +	return 0;
> +}
> +
> +static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
> +{
> +	/*
> +	 * To work, the APU RPMsg driver need to get the rproc device.
> +	 * Currently, we only use virtio so we could use that to find the
> +	 * remoteproc parent.
> +	 */
> +	if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
> +		dev_err(&rpdev->dev, "invalid rpmsg device\n");
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
> +		dev_err(&rpdev->dev, "unsupported bus\n");
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
> +}
> +
> +static void rpmsg_apu_release_device(struct device *dev)
> +{
> +	struct rpmsg_apu *apu = dev_to_apu(dev);
> +
> +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> +	cdev_del(&apu->cdev);
> +	kfree(apu);
> +}
> +
> +static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
> +{
> +	struct rpmsg_apu *apu;
> +	struct device *dev;
> +	int ret;
> +
> +	apu = devm_kzalloc(&rpdev->dev, sizeof(*apu), GFP_KERNEL);
> +	if (!apu)
> +		return -ENOMEM;
> +	apu->rpdev = rpdev;
> +
> +	apu->rproc = apu_get_rproc(rpdev);
> +	if (IS_ERR_OR_NULL(apu->rproc))
> +		return PTR_ERR(apu->rproc);
> +
> +	dev = &apu->dev;
> +	device_initialize(dev);
> +	dev->parent = &rpdev->dev;
> +
> +	cdev_init(&apu->cdev, &rpmsg_eptdev_fops);
> +	apu->cdev.owner = THIS_MODULE;
> +
> +	ret = ida_simple_get(&rpmsg_minor_ida, 0, APU_DEV_MAX, GFP_KERNEL);
> +	if (ret < 0)
> +		goto free_apu;
> +	dev->devt = MKDEV(MAJOR(rpmsg_major), ret);
> +
> +	ret = ida_simple_get(&rpmsg_ctrl_ida, 0, 0, GFP_KERNEL);
> +	if (ret < 0)
> +		goto free_minor_ida;
> +	dev->id = ret;
> +	dev_set_name(&apu->dev, "apu%d", ret);
> +
> +	ret = cdev_add(&apu->cdev, dev->devt, 1);
> +	if (ret)
> +		goto free_ctrl_ida;
> +
> +	/* We can now rely on the release function for cleanup */
> +	dev->release = rpmsg_apu_release_device;
> +
> +	ret = device_add(dev);
> +	if (ret) {
> +		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
> +		put_device(dev);
> +	}
> +
> +	/* Make device dma capable by inheriting from parent's capabilities */
> +	set_dma_ops(&rpdev->dev, get_dma_ops(apu->rproc->dev.parent));
> +
> +	ret = dma_coerce_mask_and_coherent(&rpdev->dev,
> +					   dma_get_mask(apu->rproc->dev.parent));
> +	if (ret)
> +		goto err_put_device;
> +
> +	rpdev->dev.iommu_group = apu->rproc->dev.parent->iommu_group;
> +
> +	ret = apu_init_iovad(apu);
> +
> +	dev_set_drvdata(&rpdev->dev, apu);
> +
> +	return ret;
> +
> +err_put_device:
> +	put_device(dev);
> +free_ctrl_ida:
> +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> +free_minor_ida:
> +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> +free_apu:
> +	put_device(dev);
> +	kfree(apu);
> +
> +	return ret;
> +}
> +
> +static void apu_rpmsg_remove(struct rpmsg_device *rpdev)
> +{
> +	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
> +
> +	if (apu_iovad)
> +		kref_put(&apu_iovad->refcount, iova_domain_release);
> +
> +	device_del(&apu->dev);
> +	put_device(&apu->dev);
> +	kfree(apu);
> +}
> +
> +static const struct rpmsg_device_id apu_rpmsg_match[] = {
> +	{ APU_RPMSG_SERVICE_MT8183 },
> +	{}
> +};
> +
> +static struct rpmsg_driver apu_rpmsg_driver = {
> +	.probe = apu_rpmsg_probe,
> +	.remove = apu_rpmsg_remove,
> +	.callback = apu_rpmsg_callback,
> +	.id_table = apu_rpmsg_match,
> +	.drv  = {
> +		.name  = "apu_rpmsg",
> +	},
> +};
> +
> +static int __init apu_rpmsg_init(void)
> +{
> +	int ret;
> +
> +	ret = alloc_chrdev_region(&rpmsg_major, 0, APU_DEV_MAX, "apu");
> +	if (ret < 0) {
> +		pr_err("apu: failed to allocate char dev region\n");
> +		return ret;
> +	}
> +
> +	return register_rpmsg_driver(&apu_rpmsg_driver);
> +}
> +arch_initcall(apu_rpmsg_init);
> +
> +static void __exit apu_rpmsg_exit(void)
> +{
> +	unregister_rpmsg_driver(&apu_rpmsg_driver);
> +}
> +module_exit(apu_rpmsg_exit);
> +
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("APU RPMSG driver");
> diff --git a/drivers/rpmsg/apu_rpmsg.h b/drivers/rpmsg/apu_rpmsg.h
> new file mode 100644
> index 000000000000..54b5b7880750
> --- /dev/null
> +++ b/drivers/rpmsg/apu_rpmsg.h
> @@ -0,0 +1,52 @@
> +/* SPDX-License-Identifier: GPL-2.0
> + *
> + * Copyright 2020 BayLibre SAS
> + */
> +
> +#ifndef __APU_RPMSG_H__
> +#define __APU_RPMSG_H__
> +
> +/*
> + * Firmware request, must be aligned with the one defined in firmware.
> + * @id: Request id, used in the case of reply, to find the pending request
> + * @cmd: The command id to execute in the firmware
> + * @result: The result of the command executed on the firmware
> + * @size: The size of the data available in this request
> + * @count: The number of shared buffer
> + * @data: Contains the data attached with the request if size is greater than
> + *        zero, and the addresses of shared buffers if count is greater than
> + *        zero. Both the data and the shared buffer could be read and write
> + *        by the APU.
> + */
> +struct  apu_dev_request {
> +	u16 id;
> +	u16 cmd;
> +	u16 result;
> +	u16 size_in;
> +	u16 size_out;
> +	u16 count;
> +	u8 data[0];
> +} __packed;
> +
> +#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
> +#define APU_CTRL_SRC 1
> +#define APU_CTRL_DST 1
> +
> +/* Vendor specific resource table entry */
> +#define RSC_VENDOR_IOVA 128
> +
> +/*
> + * Firmware IOVA resource table entry
> + * Define a range of virtual device address that could mapped using the IOMMU.
> + * @da: Start virtual device address
> + * @len: Length of the virtual device address
> + * @name: name of the resource
> + */
> +struct fw_rsc_iova {
> +	u32 da;
> +	u32 len;
> +	u32 reserved;
> +	u8 name[32];
> +} __packed;
> +
> +#endif /* __APU_RPMSG_H__ */
> diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
> new file mode 100644
> index 000000000000..81c9e4af9a94
> --- /dev/null
> +++ b/include/uapi/linux/apu_rpmsg.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2020 BayLibre
> + */
> +
> +#ifndef _UAPI_RPMSG_APU_H_
> +#define _UAPI_RPMSG_APU_H_
> +
> +#include <linux/ioctl.h>
> +#include <linux/types.h>
> +
> +/*
> + * Structure containing the APU request from userspace application
> + * @cmd: The id of the command to execute on the APU
> + * @result: The result of the command executed on the APU
> + * @size: The size of the data available in this request
> + * @count: The number of shared buffer
> + * @data: Contains the data attached with the request if size is greater than
> + *        zero, and the files descriptors of shared buffers if count is greater
> + *        than zero. Both the data and the shared buffer could be read and write
> + *        by the APU.
> + */
> +struct apu_request {
> +	__u16 cmd;
> +	__u16 result;
> +	__u16 size_in;
> +	__u16 size_out;
> +	__u16 count;
> +	__u16 reserved;
> +	__u8 data[0];
> +};
> +
> +/* Send synchronous request to an APU */
> +#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
> +
> +#endif
> -- 
> 2.26.2
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183
  2020-10-14 22:55     ` Mathieu Poirier
@ 2020-10-15 16:33       ` Mathieu Poirier
  -1 siblings, 0 replies; 22+ messages in thread
From: Mathieu Poirier @ 2020-10-15 16:33 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: linux-remoteproc, ohad, bjorn.andersson, sumit.semwal,
	christian.koenig, linux-kernel, linux-media, dri-devel,
	linaro-mm-sig, jstephan, stephane.leprovost, gpain, mturquette

On Wed, Oct 14, 2020 at 04:55:34PM -0600, Mathieu Poirier wrote:
> Hi Alexandre,
> 
> On Wed, Sep 30, 2020 at 01:53:47PM +0200, Alexandre Bailon wrote:
> > This adds a driver to communicate with the APU available
> > in the mt8183. The driver is generic and could be used for other APU.
> > It mostly provides a userspace interface to send messages and
> > and share big buffers with the APU.
> > 
> > Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
> > ---
> >  drivers/rpmsg/Kconfig          |   9 +
> >  drivers/rpmsg/Makefile         |   1 +
> >  drivers/rpmsg/apu_rpmsg.c      | 606 +++++++++++++++++++++++++++++++++
> >  drivers/rpmsg/apu_rpmsg.h      |  52 +++
> >  include/uapi/linux/apu_rpmsg.h |  36 ++
> >  5 files changed, 704 insertions(+)
> >  create mode 100644 drivers/rpmsg/apu_rpmsg.c
> >  create mode 100644 drivers/rpmsg/apu_rpmsg.h
> >  create mode 100644 include/uapi/linux/apu_rpmsg.h
> > 
> > diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
> > index f96716893c2a..3437c6fc8647 100644
> > --- a/drivers/rpmsg/Kconfig
> > +++ b/drivers/rpmsg/Kconfig
> > @@ -64,4 +64,13 @@ config RPMSG_VIRTIO
> >  	select RPMSG
> >  	select VIRTIO
> >  
> > +config RPMSG_APU
> > +	tristate "APU RPMSG driver"
> > +	help
> > +	  This provides a RPMSG driver that provides some facilities to
> > +	  communicate with an accelerated processing unit (APU).
> > +	  This creates one or more char files that could be used by userspace
> > +	  to send a message to an APU. In addition, this also take care of
> > +	  sharing the memory buffer with the APU.
> > +
> >  endmenu
> > diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
> > index ffe932ef6050..93e0f3de99c9 100644
> > --- a/drivers/rpmsg/Makefile
> > +++ b/drivers/rpmsg/Makefile
> > @@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
> >  obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
> >  obj-$(CONFIG_RPMSG_QCOM_SMD)	+= qcom_smd.o
> >  obj-$(CONFIG_RPMSG_VIRTIO)	+= virtio_rpmsg_bus.o
> > +obj-$(CONFIG_RPMSG_APU)		+= apu_rpmsg.o
> > diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
> > new file mode 100644
> > index 000000000000..5131b8b8e1f2
> > --- /dev/null
> > +++ b/drivers/rpmsg/apu_rpmsg.c
> > @@ -0,0 +1,606 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +//
> > +// Copyright 2020 BayLibre SAS
> > +
> > +#include <linux/cdev.h>
> > +#include <linux/dma-buf.h>
> > +#include <linux/iommu.h>
> > +#include <linux/iova.h>
> > +#include <linux/types.h>
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +#include <linux/remoteproc.h>
> > +#include <linux/rpmsg.h>
> > +#include <linux/of.h>
> > +#include <linux/platform_device.h>
> > +#include "rpmsg_internal.h"
> > +
> > +#include <uapi/linux/apu_rpmsg.h>
> > +
> > +#include "apu_rpmsg.h"
> > +
> > +/* Maximum of APU devices supported */
> > +#define APU_DEV_MAX 2
> > +
> > +#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
> > +#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
> > +
> > +struct rpmsg_apu {
> > +	struct rpmsg_device *rpdev;
> > +	struct cdev cdev;
> > +	struct device dev;
> > +
> > +	struct rproc *rproc;
> > +	struct iommu_domain *domain;
> > +	struct iova_domain *iovad;
> > +	int iova_limit_pfn;
> > +};
> > +
> > +struct rpmsg_request {
> > +	struct completion completion;
> > +	struct list_head node;
> > +	void *req;
> > +};
> > +
> > +struct apu_buffer {
> > +	int fd;
> > +	struct dma_buf *dma_buf;
> > +	struct dma_buf_attachment *attachment;
> > +	struct sg_table *sg_table;
> > +	u32 iova;
> > +};
> > +
> > +/*
> > + * Shared IOVA domain.
> > + * The MT8183 has two VP6 core but they are sharing the IOVA.
> > + * They could be used alone, or together. In order to avoid conflict,
> > + * create an IOVA domain that could be shared by those two core.
> > + * @iovad: The IOVA domain to share between the APU cores
> > + * @refcount: Allow to automatically release the IOVA domain once all the APU
> > + *            cores has been stopped
> > + */
> > +struct apu_iova_domain {
> > +	struct iova_domain iovad;
> > +	struct kref refcount;
> > +};
> > +
> > +static dev_t rpmsg_major;
> > +static DEFINE_IDA(rpmsg_ctrl_ida);
> > +static DEFINE_IDA(rpmsg_minor_ida);
> > +static DEFINE_IDA(req_ida);
> > +static LIST_HEAD(requests);
> > +static struct apu_iova_domain *apu_iovad;
> > +
> > +static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
> > +			      void *priv, u32 addr)
> > +{
> > +	struct rpmsg_request *rpmsg_req;
> > +	struct apu_dev_request *hdr = data;
> > +
> > +	list_for_each_entry(rpmsg_req, &requests, node) {
> > +		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
> > +
> > +		if (hdr->id == tmp_hdr->id) {
> > +			memcpy(rpmsg_req->req, data, count);
> > +			complete(&rpmsg_req->completion);
> > +
> > +			return 0;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int apu_device_memory_map(struct rpmsg_apu *apu,
> > +				 struct apu_buffer *buffer)
> > +{
> > +	struct rpmsg_device *rpdev = apu->rpdev;
> > +	phys_addr_t phys;
> > +	int total_buf_space;
> > +	int iova_pfn;
> > +	int ret;
> > +
> > +	if (!buffer->fd)
> > +		return 0;
> > +
> > +	buffer->dma_buf = dma_buf_get(buffer->fd);
> > +	if (IS_ERR(buffer->dma_buf)) {
> > +		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
> > +			PTR_ERR(buffer->dma_buf));
> > +		return PTR_ERR(buffer->dma_buf);
> > +	}
> > +
> > +	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
> > +	if (IS_ERR(buffer->attachment)) {
> > +		dev_err(&rpdev->dev, "Failed to attach dma_buf\n");
> > +		ret = PTR_ERR(buffer->attachment);
> > +		goto err_dma_buf_put;
> > +	}
> > +
> > +	buffer->sg_table = dma_buf_map_attachment(buffer->attachment,
> > +						   DMA_BIDIRECTIONAL);
> > +	if (IS_ERR(buffer->sg_table)) {
> > +		dev_err(&rpdev->dev, "Failed to map attachment\n");
> > +		ret = PTR_ERR(buffer->sg_table);
> > +		goto err_dma_buf_detach;
> > +	}
> > +	phys = page_to_phys(sg_page(buffer->sg_table->sgl));
> > +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
> > +
> > +	iova_pfn = alloc_iova_fast(apu->iovad, total_buf_space >> PAGE_SHIFT,
> > +				   apu->iova_limit_pfn, true);
> > +	if (!iova_pfn) {
> > +		dev_err(&rpdev->dev, "Failed to allocate iova address\n");
> > +		ret = -ENOMEM;
> > +		goto err_dma_unmap_attachment;
> > +	}
> > +
> > +	buffer->iova = PFN_PHYS(iova_pfn);
> > +	ret = iommu_map(apu->rproc->domain, buffer->iova, phys, total_buf_space,
> > +			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
> > +	if (ret) {
> > +		dev_err(&rpdev->dev, "Failed to iommu map\n");
> > +		goto err_free_iova;
> > +	}
> > +
> > +	return 0;
> > +
> > +err_free_iova:
> > +	free_iova(apu->iovad, iova_pfn);
> > +err_dma_unmap_attachment:
> > +	dma_buf_unmap_attachment(buffer->attachment,
> > +				 buffer->sg_table,
> > +				 DMA_BIDIRECTIONAL);
> > +err_dma_buf_detach:
> > +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
> > +err_dma_buf_put:
> > +	dma_buf_put(buffer->dma_buf);
> > +
> > +	return ret;
> > +}
> > +
> > +static void apu_device_memory_unmap(struct rpmsg_apu *apu,
> > +				    struct apu_buffer *buffer)
> > +{
> > +	int total_buf_space;
> > +
> > +	if (!buffer->fd)
> > +		return;
> > +
> > +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
> > +	iommu_unmap(apu->rproc->domain, buffer->iova, total_buf_space);
> > +	free_iova(apu->iovad, PHYS_PFN(buffer->iova));
> > +	dma_buf_unmap_attachment(buffer->attachment,
> > +				 buffer->sg_table,
> > +				 DMA_BIDIRECTIONAL);
> > +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
> > +	dma_buf_put(buffer->dma_buf);
> > +}
> > +
> > +static int _apu_send_request(struct rpmsg_apu *apu,
> > +			     struct rpmsg_device *rpdev,
> > +			     struct apu_dev_request *req, int len)
> > +{
> > +
> > +	struct rpmsg_request *rpmsg_req;
> > +	int ret = 0;
> > +
> > +	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
> > +	if (req->id < 0)
> > +		return ret;
> > +
> > +	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
> > +	if (!rpmsg_req)
> > +		return -ENOMEM;
> > +
> > +	rpmsg_req->req = req;
> > +	init_completion(&rpmsg_req->completion);
> > +	list_add(&rpmsg_req->node, &requests);
> > +
> > +	ret = rpmsg_send(rpdev->ept, req, len);
> > +	if (ret)
> > +		goto free_req;
> > +
> > +	/* be careful with race here between timeout and callback*/
> > +	ret = wait_for_completion_timeout(&rpmsg_req->completion,
> > +					  msecs_to_jiffies(1000));
> > +	if (!ret)
> > +		ret = -ETIMEDOUT;
> > +	else
> > +		ret = 0;
> > +
> > +	ida_simple_remove(&req_ida, req->id);
> > +
> > +free_req:
> > +
> > +	list_del(&rpmsg_req->node);
> > +	kfree(rpmsg_req);
> > +
> > +	return ret;
> > +}
> > +
> > +static int apu_send_request(struct rpmsg_apu *apu,
> > +			    struct apu_request *req)
> > +{
> > +	int ret;
> > +	struct rpmsg_device *rpdev = apu->rpdev;
> > +	struct apu_dev_request *dev_req;
> > +	struct apu_buffer *buffer;
> > +
> > +	int size = req->size_in + req->size_out +
> > +		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
> > +	u32 *fd = (u32 *)(req->data + req->size_in + req->size_out);
> > +	u32 *buffer_size = (u32 *)(fd + req->count);
> > +	u32 *dev_req_da;
> > +	u32 *dev_req_buffer_size;
> > +	int i;
> > +
> > +	dev_req = kmalloc(size, GFP_KERNEL);
> > +	if (!dev_req)
> > +		return -ENOMEM;
> > +
> > +	dev_req->cmd = req->cmd;
> > +	dev_req->size_in = req->size_in;
> > +	dev_req->size_out = req->size_out;
> > +	dev_req->count = req->count;
> > +	dev_req_da = (u32 *)(dev_req->data + req->size_in + req->size_out);
> 
> I have started to review this set but it will take me more time to wrap my head
> around what you are doing (the overall lack of comments in the code doesn't
> help).
> 
> In the mean time the "dev_req->data" above is very puzzling to me - did you mean
> to write "req-data"?  Otherwise I don't know how this can work since
> dev_req->data is not initalised after the kmalloc(). 

I haven't received an answer to the above question nor any feedback from the
comments I made on your previous set.  As such I will halt the revision of
this set until I hear back from you.

> 
> More comments will come tomorrow.
> 
> Thanks,
> Mathieu
> 
> > +	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
> > +	memcpy(dev_req->data, req->data, req->size_in);
> > +
> > +	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
> > +	for (i = 0; i < req->count; i++) {
> > +		buffer[i].fd = fd[i];
> > +		ret = apu_device_memory_map(apu, &buffer[i]);
> > +		if (ret)
> > +			goto err_free_memory;
> > +		dev_req_da[i] = buffer[i].iova;
> > +		dev_req_buffer_size[i] = buffer_size[i];
> > +	}
> > +
> > +	ret = _apu_send_request(apu, rpdev, dev_req, size);
> > +
> > +err_free_memory:
> > +	for (i--; i >= 0; i--)
> > +		apu_device_memory_unmap(apu, &buffer[i]);
> > +
> > +	req->result = dev_req->result;
> > +	req->size_in = dev_req->size_in;
> > +	req->size_out = dev_req->size_out;
> > +	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
> > +	       sizeof(u32) * req->count);
> > +
> > +	kfree(buffer);
> > +	kfree(dev_req);
> > +
> > +	return ret;
> > +}
> > +
> > +
> > +static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
> > +			       unsigned long arg)
> > +{
> > +	struct rpmsg_apu *apu = fp->private_data;
> > +	struct apu_request apu_req;
> > +	struct apu_request *apu_req_full;
> > +	void __user *argp = (void __user *)arg;
> > +	int len;
> > +	int ret;
> > +
> > +	switch (cmd) {
> > +	case APU_SEND_REQ_IOCTL:
> > +		/* Get the header */
> > +		if (copy_from_user(&apu_req, argp,
> > +				   sizeof(apu_req)))
> > +			return -EFAULT;
> > +
> > +		len = sizeof(*apu_req_full) + apu_req.size_in +
> > +			apu_req.size_out + apu_req.count * sizeof(u32) * 2;
> > +		apu_req_full = kzalloc(len, GFP_KERNEL);
> > +		if (!apu_req_full)
> > +			return -ENOMEM;
> > +
> > +		/* Get the whole request */
> > +		if (copy_from_user(apu_req_full, argp, len)) {
> > +			kfree(apu_req_full);
> > +			return -EFAULT;
> > +		}
> > +
> > +		ret = apu_send_request(apu, apu_req_full);
> > +		if (ret) {
> > +			kfree(apu_req_full);
> > +			return ret;
> > +		}
> > +
> > +		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
> > +				 sizeof(u32) * apu_req_full->count +
> > +				 apu_req_full->size_in + apu_req_full->size_out))
> > +			ret = -EFAULT;
> > +
> > +		kfree(apu_req_full);
> > +		return ret;
> > +
> > +	default:
> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
> > +{
> > +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
> > +
> > +	get_device(&apu->dev);
> > +	filp->private_data = apu;
> > +
> > +	return 0;
> > +}
> > +
> > +static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
> > +{
> > +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
> > +
> > +	put_device(&apu->dev);
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct file_operations rpmsg_eptdev_fops = {
> > +	.owner = THIS_MODULE,
> > +	.open = rpmsg_eptdev_open,
> > +	.release = rpmsg_eptdev_release,
> > +	.unlocked_ioctl = rpmsg_eptdev_ioctl,
> > +	.compat_ioctl = rpmsg_eptdev_ioctl,
> > +};
> > +
> > +static void iova_domain_release(struct kref *ref)
> > +{
> > +	put_iova_domain(&apu_iovad->iovad);
> > +	kfree(apu_iovad);
> > +	apu_iovad = NULL;
> > +}
> > +
> > +static struct fw_rsc_iova *apu_find_rcs_iova(struct rpmsg_apu *apu)
> > +{
> > +	struct rproc *rproc = apu->rproc;
> > +	struct resource_table *table;
> > +	struct fw_rsc_iova *rsc;
> > +	int i;
> > +
> > +	table = rproc->table_ptr;
> > +	for (i = 0; i < table->num; i++) {
> > +		int offset = table->offset[i];
> > +		struct fw_rsc_hdr *hdr = (void *)table + offset;
> > +
> > +		switch (hdr->type) {
> > +		case RSC_VENDOR_IOVA:
> > +			rsc = (void *)hdr + sizeof(*hdr);
> > +				return rsc;
> > +			break;
> > +		default:
> > +			continue;
> > +		}
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> > +static int apu_reserve_iova(struct rpmsg_apu *apu, struct iova_domain *iovad)
> > +{
> > +	struct rproc *rproc = apu->rproc;
> > +	struct resource_table *table;
> > +	struct fw_rsc_carveout *rsc;
> > +	int i;
> > +
> > +	table = rproc->table_ptr;
> > +	for (i = 0; i < table->num; i++) {
> > +		int offset = table->offset[i];
> > +		struct fw_rsc_hdr *hdr = (void *)table + offset;
> > +
> > +		if (hdr->type == RSC_CARVEOUT) {
> > +			struct iova *iova;
> > +
> > +			rsc = (void *)hdr + sizeof(*hdr);
> > +			iova = reserve_iova(iovad, PHYS_PFN(rsc->da),
> > +					    PHYS_PFN(rsc->da + rsc->len));
> > +			if (!iova) {
> > +				dev_err(&apu->dev, "failed to reserve iova\n");
> > +				return -ENOMEM;
> > +			}
> > +			dev_dbg(&apu->dev, "Reserve: %x - %x\n",
> > +				rsc->da, rsc->da + rsc->len);
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int apu_init_iovad(struct rpmsg_apu *apu)
> > +{
> > +	struct fw_rsc_iova *rsc;
> > +
> > +	if (!apu->rproc->table_ptr) {
> > +		dev_err(&apu->dev,
> > +			"No resource_table: has the firmware been loaded ?\n");
> > +		return -ENODEV;
> > +	}
> > +
> > +	rsc = apu_find_rcs_iova(apu);
> > +	if (!rsc) {
> > +		dev_err(&apu->dev, "No iova range defined in resource_table\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	if (!apu_iovad) {
> > +		apu_iovad = kzalloc(sizeof(*apu_iovad), GFP_KERNEL);
> > +		if (!apu_iovad)
> > +			return -ENOMEM;
> > +
> > +		init_iova_domain(&apu_iovad->iovad, PAGE_SIZE,
> > +				 PHYS_PFN(rsc->da));
> > +		apu_reserve_iova(apu, &apu_iovad->iovad);
> > +		kref_init(&apu_iovad->refcount);
> > +	} else
> > +		kref_get(&apu_iovad->refcount);
> > +
> > +	apu->iovad = &apu_iovad->iovad;
> > +	apu->iova_limit_pfn = PHYS_PFN(rsc->da + rsc->len) - 1;
> > +
> > +	return 0;
> > +}
> > +
> > +static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
> > +{
> > +	/*
> > +	 * To work, the APU RPMsg driver need to get the rproc device.
> > +	 * Currently, we only use virtio so we could use that to find the
> > +	 * remoteproc parent.
> > +	 */
> > +	if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
> > +		dev_err(&rpdev->dev, "invalid rpmsg device\n");
> > +		return ERR_PTR(-EINVAL);
> > +	}
> > +
> > +	if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
> > +		dev_err(&rpdev->dev, "unsupported bus\n");
> > +		return ERR_PTR(-EINVAL);
> > +	}
> > +
> > +	return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
> > +}
> > +
> > +static void rpmsg_apu_release_device(struct device *dev)
> > +{
> > +	struct rpmsg_apu *apu = dev_to_apu(dev);
> > +
> > +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> > +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> > +	cdev_del(&apu->cdev);
> > +	kfree(apu);
> > +}
> > +
> > +static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
> > +{
> > +	struct rpmsg_apu *apu;
> > +	struct device *dev;
> > +	int ret;
> > +
> > +	apu = devm_kzalloc(&rpdev->dev, sizeof(*apu), GFP_KERNEL);
> > +	if (!apu)
> > +		return -ENOMEM;
> > +	apu->rpdev = rpdev;
> > +
> > +	apu->rproc = apu_get_rproc(rpdev);
> > +	if (IS_ERR_OR_NULL(apu->rproc))
> > +		return PTR_ERR(apu->rproc);
> > +
> > +	dev = &apu->dev;
> > +	device_initialize(dev);
> > +	dev->parent = &rpdev->dev;
> > +
> > +	cdev_init(&apu->cdev, &rpmsg_eptdev_fops);
> > +	apu->cdev.owner = THIS_MODULE;
> > +
> > +	ret = ida_simple_get(&rpmsg_minor_ida, 0, APU_DEV_MAX, GFP_KERNEL);
> > +	if (ret < 0)
> > +		goto free_apu;
> > +	dev->devt = MKDEV(MAJOR(rpmsg_major), ret);
> > +
> > +	ret = ida_simple_get(&rpmsg_ctrl_ida, 0, 0, GFP_KERNEL);
> > +	if (ret < 0)
> > +		goto free_minor_ida;
> > +	dev->id = ret;
> > +	dev_set_name(&apu->dev, "apu%d", ret);
> > +
> > +	ret = cdev_add(&apu->cdev, dev->devt, 1);
> > +	if (ret)
> > +		goto free_ctrl_ida;
> > +
> > +	/* We can now rely on the release function for cleanup */
> > +	dev->release = rpmsg_apu_release_device;
> > +
> > +	ret = device_add(dev);
> > +	if (ret) {
> > +		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
> > +		put_device(dev);
> > +	}
> > +
> > +	/* Make device dma capable by inheriting from parent's capabilities */
> > +	set_dma_ops(&rpdev->dev, get_dma_ops(apu->rproc->dev.parent));
> > +
> > +	ret = dma_coerce_mask_and_coherent(&rpdev->dev,
> > +					   dma_get_mask(apu->rproc->dev.parent));
> > +	if (ret)
> > +		goto err_put_device;
> > +
> > +	rpdev->dev.iommu_group = apu->rproc->dev.parent->iommu_group;
> > +
> > +	ret = apu_init_iovad(apu);
> > +
> > +	dev_set_drvdata(&rpdev->dev, apu);
> > +
> > +	return ret;
> > +
> > +err_put_device:
> > +	put_device(dev);
> > +free_ctrl_ida:
> > +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> > +free_minor_ida:
> > +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> > +free_apu:
> > +	put_device(dev);
> > +	kfree(apu);
> > +
> > +	return ret;
> > +}
> > +
> > +static void apu_rpmsg_remove(struct rpmsg_device *rpdev)
> > +{
> > +	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
> > +
> > +	if (apu_iovad)
> > +		kref_put(&apu_iovad->refcount, iova_domain_release);
> > +
> > +	device_del(&apu->dev);
> > +	put_device(&apu->dev);
> > +	kfree(apu);
> > +}
> > +
> > +static const struct rpmsg_device_id apu_rpmsg_match[] = {
> > +	{ APU_RPMSG_SERVICE_MT8183 },
> > +	{}
> > +};
> > +
> > +static struct rpmsg_driver apu_rpmsg_driver = {
> > +	.probe = apu_rpmsg_probe,
> > +	.remove = apu_rpmsg_remove,
> > +	.callback = apu_rpmsg_callback,
> > +	.id_table = apu_rpmsg_match,
> > +	.drv  = {
> > +		.name  = "apu_rpmsg",
> > +	},
> > +};
> > +
> > +static int __init apu_rpmsg_init(void)
> > +{
> > +	int ret;
> > +
> > +	ret = alloc_chrdev_region(&rpmsg_major, 0, APU_DEV_MAX, "apu");
> > +	if (ret < 0) {
> > +		pr_err("apu: failed to allocate char dev region\n");
> > +		return ret;
> > +	}
> > +
> > +	return register_rpmsg_driver(&apu_rpmsg_driver);
> > +}
> > +arch_initcall(apu_rpmsg_init);
> > +
> > +static void __exit apu_rpmsg_exit(void)
> > +{
> > +	unregister_rpmsg_driver(&apu_rpmsg_driver);
> > +}
> > +module_exit(apu_rpmsg_exit);
> > +
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_DESCRIPTION("APU RPMSG driver");
> > diff --git a/drivers/rpmsg/apu_rpmsg.h b/drivers/rpmsg/apu_rpmsg.h
> > new file mode 100644
> > index 000000000000..54b5b7880750
> > --- /dev/null
> > +++ b/drivers/rpmsg/apu_rpmsg.h
> > @@ -0,0 +1,52 @@
> > +/* SPDX-License-Identifier: GPL-2.0
> > + *
> > + * Copyright 2020 BayLibre SAS
> > + */
> > +
> > +#ifndef __APU_RPMSG_H__
> > +#define __APU_RPMSG_H__
> > +
> > +/*
> > + * Firmware request, must be aligned with the one defined in firmware.
> > + * @id: Request id, used in the case of reply, to find the pending request
> > + * @cmd: The command id to execute in the firmware
> > + * @result: The result of the command executed on the firmware
> > + * @size: The size of the data available in this request
> > + * @count: The number of shared buffer
> > + * @data: Contains the data attached with the request if size is greater than
> > + *        zero, and the addresses of shared buffers if count is greater than
> > + *        zero. Both the data and the shared buffer could be read and write
> > + *        by the APU.
> > + */
> > +struct  apu_dev_request {
> > +	u16 id;
> > +	u16 cmd;
> > +	u16 result;
> > +	u16 size_in;
> > +	u16 size_out;
> > +	u16 count;
> > +	u8 data[0];
> > +} __packed;
> > +
> > +#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
> > +#define APU_CTRL_SRC 1
> > +#define APU_CTRL_DST 1
> > +
> > +/* Vendor specific resource table entry */
> > +#define RSC_VENDOR_IOVA 128
> > +
> > +/*
> > + * Firmware IOVA resource table entry
> > + * Define a range of virtual device address that could mapped using the IOMMU.
> > + * @da: Start virtual device address
> > + * @len: Length of the virtual device address
> > + * @name: name of the resource
> > + */
> > +struct fw_rsc_iova {
> > +	u32 da;
> > +	u32 len;
> > +	u32 reserved;
> > +	u8 name[32];
> > +} __packed;
> > +
> > +#endif /* __APU_RPMSG_H__ */
> > diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
> > new file mode 100644
> > index 000000000000..81c9e4af9a94
> > --- /dev/null
> > +++ b/include/uapi/linux/apu_rpmsg.h
> > @@ -0,0 +1,36 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/*
> > + * Copyright (c) 2020 BayLibre
> > + */
> > +
> > +#ifndef _UAPI_RPMSG_APU_H_
> > +#define _UAPI_RPMSG_APU_H_
> > +
> > +#include <linux/ioctl.h>
> > +#include <linux/types.h>
> > +
> > +/*
> > + * Structure containing the APU request from userspace application
> > + * @cmd: The id of the command to execute on the APU
> > + * @result: The result of the command executed on the APU
> > + * @size: The size of the data available in this request
> > + * @count: The number of shared buffer
> > + * @data: Contains the data attached with the request if size is greater than
> > + *        zero, and the files descriptors of shared buffers if count is greater
> > + *        than zero. Both the data and the shared buffer could be read and write
> > + *        by the APU.
> > + */
> > +struct apu_request {
> > +	__u16 cmd;
> > +	__u16 result;
> > +	__u16 size_in;
> > +	__u16 size_out;
> > +	__u16 count;
> > +	__u16 reserved;
> > +	__u8 data[0];
> > +};
> > +
> > +/* Send synchronous request to an APU */
> > +#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
> > +
> > +#endif
> > -- 
> > 2.26.2
> > 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183
@ 2020-10-15 16:33       ` Mathieu Poirier
  0 siblings, 0 replies; 22+ messages in thread
From: Mathieu Poirier @ 2020-10-15 16:33 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: ohad, gpain, stephane.leprovost, jstephan, linux-remoteproc,
	linux-kernel, dri-devel, linaro-mm-sig, mturquette,
	bjorn.andersson, christian.koenig, linux-media

On Wed, Oct 14, 2020 at 04:55:34PM -0600, Mathieu Poirier wrote:
> Hi Alexandre,
> 
> On Wed, Sep 30, 2020 at 01:53:47PM +0200, Alexandre Bailon wrote:
> > This adds a driver to communicate with the APU available
> > in the mt8183. The driver is generic and could be used for other APU.
> > It mostly provides a userspace interface to send messages and
> > and share big buffers with the APU.
> > 
> > Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
> > ---
> >  drivers/rpmsg/Kconfig          |   9 +
> >  drivers/rpmsg/Makefile         |   1 +
> >  drivers/rpmsg/apu_rpmsg.c      | 606 +++++++++++++++++++++++++++++++++
> >  drivers/rpmsg/apu_rpmsg.h      |  52 +++
> >  include/uapi/linux/apu_rpmsg.h |  36 ++
> >  5 files changed, 704 insertions(+)
> >  create mode 100644 drivers/rpmsg/apu_rpmsg.c
> >  create mode 100644 drivers/rpmsg/apu_rpmsg.h
> >  create mode 100644 include/uapi/linux/apu_rpmsg.h
> > 
> > diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
> > index f96716893c2a..3437c6fc8647 100644
> > --- a/drivers/rpmsg/Kconfig
> > +++ b/drivers/rpmsg/Kconfig
> > @@ -64,4 +64,13 @@ config RPMSG_VIRTIO
> >  	select RPMSG
> >  	select VIRTIO
> >  
> > +config RPMSG_APU
> > +	tristate "APU RPMSG driver"
> > +	help
> > +	  This provides a RPMSG driver that provides some facilities to
> > +	  communicate with an accelerated processing unit (APU).
> > +	  This creates one or more char files that could be used by userspace
> > +	  to send a message to an APU. In addition, this also take care of
> > +	  sharing the memory buffer with the APU.
> > +
> >  endmenu
> > diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
> > index ffe932ef6050..93e0f3de99c9 100644
> > --- a/drivers/rpmsg/Makefile
> > +++ b/drivers/rpmsg/Makefile
> > @@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
> >  obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
> >  obj-$(CONFIG_RPMSG_QCOM_SMD)	+= qcom_smd.o
> >  obj-$(CONFIG_RPMSG_VIRTIO)	+= virtio_rpmsg_bus.o
> > +obj-$(CONFIG_RPMSG_APU)		+= apu_rpmsg.o
> > diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
> > new file mode 100644
> > index 000000000000..5131b8b8e1f2
> > --- /dev/null
> > +++ b/drivers/rpmsg/apu_rpmsg.c
> > @@ -0,0 +1,606 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +//
> > +// Copyright 2020 BayLibre SAS
> > +
> > +#include <linux/cdev.h>
> > +#include <linux/dma-buf.h>
> > +#include <linux/iommu.h>
> > +#include <linux/iova.h>
> > +#include <linux/types.h>
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +#include <linux/remoteproc.h>
> > +#include <linux/rpmsg.h>
> > +#include <linux/of.h>
> > +#include <linux/platform_device.h>
> > +#include "rpmsg_internal.h"
> > +
> > +#include <uapi/linux/apu_rpmsg.h>
> > +
> > +#include "apu_rpmsg.h"
> > +
> > +/* Maximum of APU devices supported */
> > +#define APU_DEV_MAX 2
> > +
> > +#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
> > +#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
> > +
> > +struct rpmsg_apu {
> > +	struct rpmsg_device *rpdev;
> > +	struct cdev cdev;
> > +	struct device dev;
> > +
> > +	struct rproc *rproc;
> > +	struct iommu_domain *domain;
> > +	struct iova_domain *iovad;
> > +	int iova_limit_pfn;
> > +};
> > +
> > +struct rpmsg_request {
> > +	struct completion completion;
> > +	struct list_head node;
> > +	void *req;
> > +};
> > +
> > +struct apu_buffer {
> > +	int fd;
> > +	struct dma_buf *dma_buf;
> > +	struct dma_buf_attachment *attachment;
> > +	struct sg_table *sg_table;
> > +	u32 iova;
> > +};
> > +
> > +/*
> > + * Shared IOVA domain.
> > + * The MT8183 has two VP6 core but they are sharing the IOVA.
> > + * They could be used alone, or together. In order to avoid conflict,
> > + * create an IOVA domain that could be shared by those two core.
> > + * @iovad: The IOVA domain to share between the APU cores
> > + * @refcount: Allow to automatically release the IOVA domain once all the APU
> > + *            cores has been stopped
> > + */
> > +struct apu_iova_domain {
> > +	struct iova_domain iovad;
> > +	struct kref refcount;
> > +};
> > +
> > +static dev_t rpmsg_major;
> > +static DEFINE_IDA(rpmsg_ctrl_ida);
> > +static DEFINE_IDA(rpmsg_minor_ida);
> > +static DEFINE_IDA(req_ida);
> > +static LIST_HEAD(requests);
> > +static struct apu_iova_domain *apu_iovad;
> > +
> > +static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
> > +			      void *priv, u32 addr)
> > +{
> > +	struct rpmsg_request *rpmsg_req;
> > +	struct apu_dev_request *hdr = data;
> > +
> > +	list_for_each_entry(rpmsg_req, &requests, node) {
> > +		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
> > +
> > +		if (hdr->id == tmp_hdr->id) {
> > +			memcpy(rpmsg_req->req, data, count);
> > +			complete(&rpmsg_req->completion);
> > +
> > +			return 0;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int apu_device_memory_map(struct rpmsg_apu *apu,
> > +				 struct apu_buffer *buffer)
> > +{
> > +	struct rpmsg_device *rpdev = apu->rpdev;
> > +	phys_addr_t phys;
> > +	int total_buf_space;
> > +	int iova_pfn;
> > +	int ret;
> > +
> > +	if (!buffer->fd)
> > +		return 0;
> > +
> > +	buffer->dma_buf = dma_buf_get(buffer->fd);
> > +	if (IS_ERR(buffer->dma_buf)) {
> > +		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
> > +			PTR_ERR(buffer->dma_buf));
> > +		return PTR_ERR(buffer->dma_buf);
> > +	}
> > +
> > +	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
> > +	if (IS_ERR(buffer->attachment)) {
> > +		dev_err(&rpdev->dev, "Failed to attach dma_buf\n");
> > +		ret = PTR_ERR(buffer->attachment);
> > +		goto err_dma_buf_put;
> > +	}
> > +
> > +	buffer->sg_table = dma_buf_map_attachment(buffer->attachment,
> > +						   DMA_BIDIRECTIONAL);
> > +	if (IS_ERR(buffer->sg_table)) {
> > +		dev_err(&rpdev->dev, "Failed to map attachment\n");
> > +		ret = PTR_ERR(buffer->sg_table);
> > +		goto err_dma_buf_detach;
> > +	}
> > +	phys = page_to_phys(sg_page(buffer->sg_table->sgl));
> > +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
> > +
> > +	iova_pfn = alloc_iova_fast(apu->iovad, total_buf_space >> PAGE_SHIFT,
> > +				   apu->iova_limit_pfn, true);
> > +	if (!iova_pfn) {
> > +		dev_err(&rpdev->dev, "Failed to allocate iova address\n");
> > +		ret = -ENOMEM;
> > +		goto err_dma_unmap_attachment;
> > +	}
> > +
> > +	buffer->iova = PFN_PHYS(iova_pfn);
> > +	ret = iommu_map(apu->rproc->domain, buffer->iova, phys, total_buf_space,
> > +			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
> > +	if (ret) {
> > +		dev_err(&rpdev->dev, "Failed to iommu map\n");
> > +		goto err_free_iova;
> > +	}
> > +
> > +	return 0;
> > +
> > +err_free_iova:
> > +	free_iova(apu->iovad, iova_pfn);
> > +err_dma_unmap_attachment:
> > +	dma_buf_unmap_attachment(buffer->attachment,
> > +				 buffer->sg_table,
> > +				 DMA_BIDIRECTIONAL);
> > +err_dma_buf_detach:
> > +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
> > +err_dma_buf_put:
> > +	dma_buf_put(buffer->dma_buf);
> > +
> > +	return ret;
> > +}
> > +
> > +static void apu_device_memory_unmap(struct rpmsg_apu *apu,
> > +				    struct apu_buffer *buffer)
> > +{
> > +	int total_buf_space;
> > +
> > +	if (!buffer->fd)
> > +		return;
> > +
> > +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
> > +	iommu_unmap(apu->rproc->domain, buffer->iova, total_buf_space);
> > +	free_iova(apu->iovad, PHYS_PFN(buffer->iova));
> > +	dma_buf_unmap_attachment(buffer->attachment,
> > +				 buffer->sg_table,
> > +				 DMA_BIDIRECTIONAL);
> > +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
> > +	dma_buf_put(buffer->dma_buf);
> > +}
> > +
> > +static int _apu_send_request(struct rpmsg_apu *apu,
> > +			     struct rpmsg_device *rpdev,
> > +			     struct apu_dev_request *req, int len)
> > +{
> > +
> > +	struct rpmsg_request *rpmsg_req;
> > +	int ret = 0;
> > +
> > +	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
> > +	if (req->id < 0)
> > +		return ret;
> > +
> > +	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
> > +	if (!rpmsg_req)
> > +		return -ENOMEM;
> > +
> > +	rpmsg_req->req = req;
> > +	init_completion(&rpmsg_req->completion);
> > +	list_add(&rpmsg_req->node, &requests);
> > +
> > +	ret = rpmsg_send(rpdev->ept, req, len);
> > +	if (ret)
> > +		goto free_req;
> > +
> > +	/* be careful with race here between timeout and callback*/
> > +	ret = wait_for_completion_timeout(&rpmsg_req->completion,
> > +					  msecs_to_jiffies(1000));
> > +	if (!ret)
> > +		ret = -ETIMEDOUT;
> > +	else
> > +		ret = 0;
> > +
> > +	ida_simple_remove(&req_ida, req->id);
> > +
> > +free_req:
> > +
> > +	list_del(&rpmsg_req->node);
> > +	kfree(rpmsg_req);
> > +
> > +	return ret;
> > +}
> > +
> > +static int apu_send_request(struct rpmsg_apu *apu,
> > +			    struct apu_request *req)
> > +{
> > +	int ret;
> > +	struct rpmsg_device *rpdev = apu->rpdev;
> > +	struct apu_dev_request *dev_req;
> > +	struct apu_buffer *buffer;
> > +
> > +	int size = req->size_in + req->size_out +
> > +		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
> > +	u32 *fd = (u32 *)(req->data + req->size_in + req->size_out);
> > +	u32 *buffer_size = (u32 *)(fd + req->count);
> > +	u32 *dev_req_da;
> > +	u32 *dev_req_buffer_size;
> > +	int i;
> > +
> > +	dev_req = kmalloc(size, GFP_KERNEL);
> > +	if (!dev_req)
> > +		return -ENOMEM;
> > +
> > +	dev_req->cmd = req->cmd;
> > +	dev_req->size_in = req->size_in;
> > +	dev_req->size_out = req->size_out;
> > +	dev_req->count = req->count;
> > +	dev_req_da = (u32 *)(dev_req->data + req->size_in + req->size_out);
> 
> I have started to review this set but it will take me more time to wrap my head
> around what you are doing (the overall lack of comments in the code doesn't
> help).
> 
> In the mean time the "dev_req->data" above is very puzzling to me - did you mean
> to write "req-data"?  Otherwise I don't know how this can work since
> dev_req->data is not initalised after the kmalloc(). 

I haven't received an answer to the above question nor any feedback from the
comments I made on your previous set.  As such I will halt the revision of
this set until I hear back from you.

> 
> More comments will come tomorrow.
> 
> Thanks,
> Mathieu
> 
> > +	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
> > +	memcpy(dev_req->data, req->data, req->size_in);
> > +
> > +	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
> > +	for (i = 0; i < req->count; i++) {
> > +		buffer[i].fd = fd[i];
> > +		ret = apu_device_memory_map(apu, &buffer[i]);
> > +		if (ret)
> > +			goto err_free_memory;
> > +		dev_req_da[i] = buffer[i].iova;
> > +		dev_req_buffer_size[i] = buffer_size[i];
> > +	}
> > +
> > +	ret = _apu_send_request(apu, rpdev, dev_req, size);
> > +
> > +err_free_memory:
> > +	for (i--; i >= 0; i--)
> > +		apu_device_memory_unmap(apu, &buffer[i]);
> > +
> > +	req->result = dev_req->result;
> > +	req->size_in = dev_req->size_in;
> > +	req->size_out = dev_req->size_out;
> > +	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
> > +	       sizeof(u32) * req->count);
> > +
> > +	kfree(buffer);
> > +	kfree(dev_req);
> > +
> > +	return ret;
> > +}
> > +
> > +
> > +static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
> > +			       unsigned long arg)
> > +{
> > +	struct rpmsg_apu *apu = fp->private_data;
> > +	struct apu_request apu_req;
> > +	struct apu_request *apu_req_full;
> > +	void __user *argp = (void __user *)arg;
> > +	int len;
> > +	int ret;
> > +
> > +	switch (cmd) {
> > +	case APU_SEND_REQ_IOCTL:
> > +		/* Get the header */
> > +		if (copy_from_user(&apu_req, argp,
> > +				   sizeof(apu_req)))
> > +			return -EFAULT;
> > +
> > +		len = sizeof(*apu_req_full) + apu_req.size_in +
> > +			apu_req.size_out + apu_req.count * sizeof(u32) * 2;
> > +		apu_req_full = kzalloc(len, GFP_KERNEL);
> > +		if (!apu_req_full)
> > +			return -ENOMEM;
> > +
> > +		/* Get the whole request */
> > +		if (copy_from_user(apu_req_full, argp, len)) {
> > +			kfree(apu_req_full);
> > +			return -EFAULT;
> > +		}
> > +
> > +		ret = apu_send_request(apu, apu_req_full);
> > +		if (ret) {
> > +			kfree(apu_req_full);
> > +			return ret;
> > +		}
> > +
> > +		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
> > +				 sizeof(u32) * apu_req_full->count +
> > +				 apu_req_full->size_in + apu_req_full->size_out))
> > +			ret = -EFAULT;
> > +
> > +		kfree(apu_req_full);
> > +		return ret;
> > +
> > +	default:
> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
> > +{
> > +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
> > +
> > +	get_device(&apu->dev);
> > +	filp->private_data = apu;
> > +
> > +	return 0;
> > +}
> > +
> > +static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
> > +{
> > +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
> > +
> > +	put_device(&apu->dev);
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct file_operations rpmsg_eptdev_fops = {
> > +	.owner = THIS_MODULE,
> > +	.open = rpmsg_eptdev_open,
> > +	.release = rpmsg_eptdev_release,
> > +	.unlocked_ioctl = rpmsg_eptdev_ioctl,
> > +	.compat_ioctl = rpmsg_eptdev_ioctl,
> > +};
> > +
> > +static void iova_domain_release(struct kref *ref)
> > +{
> > +	put_iova_domain(&apu_iovad->iovad);
> > +	kfree(apu_iovad);
> > +	apu_iovad = NULL;
> > +}
> > +
> > +static struct fw_rsc_iova *apu_find_rcs_iova(struct rpmsg_apu *apu)
> > +{
> > +	struct rproc *rproc = apu->rproc;
> > +	struct resource_table *table;
> > +	struct fw_rsc_iova *rsc;
> > +	int i;
> > +
> > +	table = rproc->table_ptr;
> > +	for (i = 0; i < table->num; i++) {
> > +		int offset = table->offset[i];
> > +		struct fw_rsc_hdr *hdr = (void *)table + offset;
> > +
> > +		switch (hdr->type) {
> > +		case RSC_VENDOR_IOVA:
> > +			rsc = (void *)hdr + sizeof(*hdr);
> > +				return rsc;
> > +			break;
> > +		default:
> > +			continue;
> > +		}
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> > +static int apu_reserve_iova(struct rpmsg_apu *apu, struct iova_domain *iovad)
> > +{
> > +	struct rproc *rproc = apu->rproc;
> > +	struct resource_table *table;
> > +	struct fw_rsc_carveout *rsc;
> > +	int i;
> > +
> > +	table = rproc->table_ptr;
> > +	for (i = 0; i < table->num; i++) {
> > +		int offset = table->offset[i];
> > +		struct fw_rsc_hdr *hdr = (void *)table + offset;
> > +
> > +		if (hdr->type == RSC_CARVEOUT) {
> > +			struct iova *iova;
> > +
> > +			rsc = (void *)hdr + sizeof(*hdr);
> > +			iova = reserve_iova(iovad, PHYS_PFN(rsc->da),
> > +					    PHYS_PFN(rsc->da + rsc->len));
> > +			if (!iova) {
> > +				dev_err(&apu->dev, "failed to reserve iova\n");
> > +				return -ENOMEM;
> > +			}
> > +			dev_dbg(&apu->dev, "Reserve: %x - %x\n",
> > +				rsc->da, rsc->da + rsc->len);
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int apu_init_iovad(struct rpmsg_apu *apu)
> > +{
> > +	struct fw_rsc_iova *rsc;
> > +
> > +	if (!apu->rproc->table_ptr) {
> > +		dev_err(&apu->dev,
> > +			"No resource_table: has the firmware been loaded ?\n");
> > +		return -ENODEV;
> > +	}
> > +
> > +	rsc = apu_find_rcs_iova(apu);
> > +	if (!rsc) {
> > +		dev_err(&apu->dev, "No iova range defined in resource_table\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	if (!apu_iovad) {
> > +		apu_iovad = kzalloc(sizeof(*apu_iovad), GFP_KERNEL);
> > +		if (!apu_iovad)
> > +			return -ENOMEM;
> > +
> > +		init_iova_domain(&apu_iovad->iovad, PAGE_SIZE,
> > +				 PHYS_PFN(rsc->da));
> > +		apu_reserve_iova(apu, &apu_iovad->iovad);
> > +		kref_init(&apu_iovad->refcount);
> > +	} else
> > +		kref_get(&apu_iovad->refcount);
> > +
> > +	apu->iovad = &apu_iovad->iovad;
> > +	apu->iova_limit_pfn = PHYS_PFN(rsc->da + rsc->len) - 1;
> > +
> > +	return 0;
> > +}
> > +
> > +static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
> > +{
> > +	/*
> > +	 * To work, the APU RPMsg driver need to get the rproc device.
> > +	 * Currently, we only use virtio so we could use that to find the
> > +	 * remoteproc parent.
> > +	 */
> > +	if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
> > +		dev_err(&rpdev->dev, "invalid rpmsg device\n");
> > +		return ERR_PTR(-EINVAL);
> > +	}
> > +
> > +	if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
> > +		dev_err(&rpdev->dev, "unsupported bus\n");
> > +		return ERR_PTR(-EINVAL);
> > +	}
> > +
> > +	return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
> > +}
> > +
> > +static void rpmsg_apu_release_device(struct device *dev)
> > +{
> > +	struct rpmsg_apu *apu = dev_to_apu(dev);
> > +
> > +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> > +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> > +	cdev_del(&apu->cdev);
> > +	kfree(apu);
> > +}
> > +
> > +static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
> > +{
> > +	struct rpmsg_apu *apu;
> > +	struct device *dev;
> > +	int ret;
> > +
> > +	apu = devm_kzalloc(&rpdev->dev, sizeof(*apu), GFP_KERNEL);
> > +	if (!apu)
> > +		return -ENOMEM;
> > +	apu->rpdev = rpdev;
> > +
> > +	apu->rproc = apu_get_rproc(rpdev);
> > +	if (IS_ERR_OR_NULL(apu->rproc))
> > +		return PTR_ERR(apu->rproc);
> > +
> > +	dev = &apu->dev;
> > +	device_initialize(dev);
> > +	dev->parent = &rpdev->dev;
> > +
> > +	cdev_init(&apu->cdev, &rpmsg_eptdev_fops);
> > +	apu->cdev.owner = THIS_MODULE;
> > +
> > +	ret = ida_simple_get(&rpmsg_minor_ida, 0, APU_DEV_MAX, GFP_KERNEL);
> > +	if (ret < 0)
> > +		goto free_apu;
> > +	dev->devt = MKDEV(MAJOR(rpmsg_major), ret);
> > +
> > +	ret = ida_simple_get(&rpmsg_ctrl_ida, 0, 0, GFP_KERNEL);
> > +	if (ret < 0)
> > +		goto free_minor_ida;
> > +	dev->id = ret;
> > +	dev_set_name(&apu->dev, "apu%d", ret);
> > +
> > +	ret = cdev_add(&apu->cdev, dev->devt, 1);
> > +	if (ret)
> > +		goto free_ctrl_ida;
> > +
> > +	/* We can now rely on the release function for cleanup */
> > +	dev->release = rpmsg_apu_release_device;
> > +
> > +	ret = device_add(dev);
> > +	if (ret) {
> > +		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
> > +		put_device(dev);
> > +	}
> > +
> > +	/* Make device dma capable by inheriting from parent's capabilities */
> > +	set_dma_ops(&rpdev->dev, get_dma_ops(apu->rproc->dev.parent));
> > +
> > +	ret = dma_coerce_mask_and_coherent(&rpdev->dev,
> > +					   dma_get_mask(apu->rproc->dev.parent));
> > +	if (ret)
> > +		goto err_put_device;
> > +
> > +	rpdev->dev.iommu_group = apu->rproc->dev.parent->iommu_group;
> > +
> > +	ret = apu_init_iovad(apu);
> > +
> > +	dev_set_drvdata(&rpdev->dev, apu);
> > +
> > +	return ret;
> > +
> > +err_put_device:
> > +	put_device(dev);
> > +free_ctrl_ida:
> > +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> > +free_minor_ida:
> > +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> > +free_apu:
> > +	put_device(dev);
> > +	kfree(apu);
> > +
> > +	return ret;
> > +}
> > +
> > +static void apu_rpmsg_remove(struct rpmsg_device *rpdev)
> > +{
> > +	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
> > +
> > +	if (apu_iovad)
> > +		kref_put(&apu_iovad->refcount, iova_domain_release);
> > +
> > +	device_del(&apu->dev);
> > +	put_device(&apu->dev);
> > +	kfree(apu);
> > +}
> > +
> > +static const struct rpmsg_device_id apu_rpmsg_match[] = {
> > +	{ APU_RPMSG_SERVICE_MT8183 },
> > +	{}
> > +};
> > +
> > +static struct rpmsg_driver apu_rpmsg_driver = {
> > +	.probe = apu_rpmsg_probe,
> > +	.remove = apu_rpmsg_remove,
> > +	.callback = apu_rpmsg_callback,
> > +	.id_table = apu_rpmsg_match,
> > +	.drv  = {
> > +		.name  = "apu_rpmsg",
> > +	},
> > +};
> > +
> > +static int __init apu_rpmsg_init(void)
> > +{
> > +	int ret;
> > +
> > +	ret = alloc_chrdev_region(&rpmsg_major, 0, APU_DEV_MAX, "apu");
> > +	if (ret < 0) {
> > +		pr_err("apu: failed to allocate char dev region\n");
> > +		return ret;
> > +	}
> > +
> > +	return register_rpmsg_driver(&apu_rpmsg_driver);
> > +}
> > +arch_initcall(apu_rpmsg_init);
> > +
> > +static void __exit apu_rpmsg_exit(void)
> > +{
> > +	unregister_rpmsg_driver(&apu_rpmsg_driver);
> > +}
> > +module_exit(apu_rpmsg_exit);
> > +
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_DESCRIPTION("APU RPMSG driver");
> > diff --git a/drivers/rpmsg/apu_rpmsg.h b/drivers/rpmsg/apu_rpmsg.h
> > new file mode 100644
> > index 000000000000..54b5b7880750
> > --- /dev/null
> > +++ b/drivers/rpmsg/apu_rpmsg.h
> > @@ -0,0 +1,52 @@
> > +/* SPDX-License-Identifier: GPL-2.0
> > + *
> > + * Copyright 2020 BayLibre SAS
> > + */
> > +
> > +#ifndef __APU_RPMSG_H__
> > +#define __APU_RPMSG_H__
> > +
> > +/*
> > + * Firmware request, must be aligned with the one defined in firmware.
> > + * @id: Request id, used in the case of reply, to find the pending request
> > + * @cmd: The command id to execute in the firmware
> > + * @result: The result of the command executed on the firmware
> > + * @size: The size of the data available in this request
> > + * @count: The number of shared buffer
> > + * @data: Contains the data attached with the request if size is greater than
> > + *        zero, and the addresses of shared buffers if count is greater than
> > + *        zero. Both the data and the shared buffer could be read and write
> > + *        by the APU.
> > + */
> > +struct  apu_dev_request {
> > +	u16 id;
> > +	u16 cmd;
> > +	u16 result;
> > +	u16 size_in;
> > +	u16 size_out;
> > +	u16 count;
> > +	u8 data[0];
> > +} __packed;
> > +
> > +#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
> > +#define APU_CTRL_SRC 1
> > +#define APU_CTRL_DST 1
> > +
> > +/* Vendor specific resource table entry */
> > +#define RSC_VENDOR_IOVA 128
> > +
> > +/*
> > + * Firmware IOVA resource table entry
> > + * Define a range of virtual device address that could mapped using the IOMMU.
> > + * @da: Start virtual device address
> > + * @len: Length of the virtual device address
> > + * @name: name of the resource
> > + */
> > +struct fw_rsc_iova {
> > +	u32 da;
> > +	u32 len;
> > +	u32 reserved;
> > +	u8 name[32];
> > +} __packed;
> > +
> > +#endif /* __APU_RPMSG_H__ */
> > diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
> > new file mode 100644
> > index 000000000000..81c9e4af9a94
> > --- /dev/null
> > +++ b/include/uapi/linux/apu_rpmsg.h
> > @@ -0,0 +1,36 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/*
> > + * Copyright (c) 2020 BayLibre
> > + */
> > +
> > +#ifndef _UAPI_RPMSG_APU_H_
> > +#define _UAPI_RPMSG_APU_H_
> > +
> > +#include <linux/ioctl.h>
> > +#include <linux/types.h>
> > +
> > +/*
> > + * Structure containing the APU request from userspace application
> > + * @cmd: The id of the command to execute on the APU
> > + * @result: The result of the command executed on the APU
> > + * @size: The size of the data available in this request
> > + * @count: The number of shared buffer
> > + * @data: Contains the data attached with the request if size is greater than
> > + *        zero, and the files descriptors of shared buffers if count is greater
> > + *        than zero. Both the data and the shared buffer could be read and write
> > + *        by the APU.
> > + */
> > +struct apu_request {
> > +	__u16 cmd;
> > +	__u16 result;
> > +	__u16 size_in;
> > +	__u16 size_out;
> > +	__u16 count;
> > +	__u16 reserved;
> > +	__u8 data[0];
> > +};
> > +
> > +/* Send synchronous request to an APU */
> > +#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
> > +
> > +#endif
> > -- 
> > 2.26.2
> > 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183
  2020-10-15 16:33       ` Mathieu Poirier
@ 2021-07-20  8:24         ` Alexandre Bailon
  -1 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2021-07-20  8:24 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-remoteproc, ohad, bjorn.andersson, sumit.semwal,
	christian.koenig, linux-kernel, linux-media, dri-devel,
	linaro-mm-sig, jstephan, stephane.leprovost, gpain, mturquette

Hi Mathieu,

> On Wed, Oct 14, 2020 at 04:55:34PM -0600, Mathieu Poirier wrote:
>> Hi Alexandre,
>>
>> On Wed, Sep 30, 2020 at 01:53:47PM +0200, Alexandre Bailon wrote:
>>> This adds a driver to communicate with the APU available
>>> in the mt8183. The driver is generic and could be used for other APU.
>>> It mostly provides a userspace interface to send messages and
>>> and share big buffers with the APU.
>>>
>>> Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
>>> ---
>>>   drivers/rpmsg/Kconfig          |   9 +
>>>   drivers/rpmsg/Makefile         |   1 +
>>>   drivers/rpmsg/apu_rpmsg.c      | 606 +++++++++++++++++++++++++++++++++
>>>   drivers/rpmsg/apu_rpmsg.h      |  52 +++
>>>   include/uapi/linux/apu_rpmsg.h |  36 ++
>>>   5 files changed, 704 insertions(+)
>>>   create mode 100644 drivers/rpmsg/apu_rpmsg.c
>>>   create mode 100644 drivers/rpmsg/apu_rpmsg.h
>>>   create mode 100644 include/uapi/linux/apu_rpmsg.h
>>>
>>> diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
>>> index f96716893c2a..3437c6fc8647 100644
>>> --- a/drivers/rpmsg/Kconfig
>>> +++ b/drivers/rpmsg/Kconfig
>>> @@ -64,4 +64,13 @@ config RPMSG_VIRTIO
>>>   	select RPMSG
>>>   	select VIRTIO
>>>   
>>> +config RPMSG_APU
>>> +	tristate "APU RPMSG driver"
>>> +	help
>>> +	  This provides a RPMSG driver that provides some facilities to
>>> +	  communicate with an accelerated processing unit (APU).
>>> +	  This creates one or more char files that could be used by userspace
>>> +	  to send a message to an APU. In addition, this also take care of
>>> +	  sharing the memory buffer with the APU.
>>> +
>>>   endmenu
>>> diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
>>> index ffe932ef6050..93e0f3de99c9 100644
>>> --- a/drivers/rpmsg/Makefile
>>> +++ b/drivers/rpmsg/Makefile
>>> @@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
>>>   obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
>>>   obj-$(CONFIG_RPMSG_QCOM_SMD)	+= qcom_smd.o
>>>   obj-$(CONFIG_RPMSG_VIRTIO)	+= virtio_rpmsg_bus.o
>>> +obj-$(CONFIG_RPMSG_APU)		+= apu_rpmsg.o
>>> diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
>>> new file mode 100644
>>> index 000000000000..5131b8b8e1f2
>>> --- /dev/null
>>> +++ b/drivers/rpmsg/apu_rpmsg.c
>>> @@ -0,0 +1,606 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +//
>>> +// Copyright 2020 BayLibre SAS
>>> +
>>> +#include <linux/cdev.h>
>>> +#include <linux/dma-buf.h>
>>> +#include <linux/iommu.h>
>>> +#include <linux/iova.h>
>>> +#include <linux/types.h>
>>> +#include <linux/module.h>
>>> +#include <linux/slab.h>
>>> +#include <linux/remoteproc.h>
>>> +#include <linux/rpmsg.h>
>>> +#include <linux/of.h>
>>> +#include <linux/platform_device.h>
>>> +#include "rpmsg_internal.h"
>>> +
>>> +#include <uapi/linux/apu_rpmsg.h>
>>> +
>>> +#include "apu_rpmsg.h"
>>> +
>>> +/* Maximum of APU devices supported */
>>> +#define APU_DEV_MAX 2
>>> +
>>> +#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
>>> +#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
>>> +
>>> +struct rpmsg_apu {
>>> +	struct rpmsg_device *rpdev;
>>> +	struct cdev cdev;
>>> +	struct device dev;
>>> +
>>> +	struct rproc *rproc;
>>> +	struct iommu_domain *domain;
>>> +	struct iova_domain *iovad;
>>> +	int iova_limit_pfn;
>>> +};
>>> +
>>> +struct rpmsg_request {
>>> +	struct completion completion;
>>> +	struct list_head node;
>>> +	void *req;
>>> +};
>>> +
>>> +struct apu_buffer {
>>> +	int fd;
>>> +	struct dma_buf *dma_buf;
>>> +	struct dma_buf_attachment *attachment;
>>> +	struct sg_table *sg_table;
>>> +	u32 iova;
>>> +};
>>> +
>>> +/*
>>> + * Shared IOVA domain.
>>> + * The MT8183 has two VP6 core but they are sharing the IOVA.
>>> + * They could be used alone, or together. In order to avoid conflict,
>>> + * create an IOVA domain that could be shared by those two core.
>>> + * @iovad: The IOVA domain to share between the APU cores
>>> + * @refcount: Allow to automatically release the IOVA domain once all the APU
>>> + *            cores has been stopped
>>> + */
>>> +struct apu_iova_domain {
>>> +	struct iova_domain iovad;
>>> +	struct kref refcount;
>>> +};
>>> +
>>> +static dev_t rpmsg_major;
>>> +static DEFINE_IDA(rpmsg_ctrl_ida);
>>> +static DEFINE_IDA(rpmsg_minor_ida);
>>> +static DEFINE_IDA(req_ida);
>>> +static LIST_HEAD(requests);
>>> +static struct apu_iova_domain *apu_iovad;
>>> +
>>> +static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
>>> +			      void *priv, u32 addr)
>>> +{
>>> +	struct rpmsg_request *rpmsg_req;
>>> +	struct apu_dev_request *hdr = data;
>>> +
>>> +	list_for_each_entry(rpmsg_req, &requests, node) {
>>> +		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
>>> +
>>> +		if (hdr->id == tmp_hdr->id) {
>>> +			memcpy(rpmsg_req->req, data, count);
>>> +			complete(&rpmsg_req->completion);
>>> +
>>> +			return 0;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int apu_device_memory_map(struct rpmsg_apu *apu,
>>> +				 struct apu_buffer *buffer)
>>> +{
>>> +	struct rpmsg_device *rpdev = apu->rpdev;
>>> +	phys_addr_t phys;
>>> +	int total_buf_space;
>>> +	int iova_pfn;
>>> +	int ret;
>>> +
>>> +	if (!buffer->fd)
>>> +		return 0;
>>> +
>>> +	buffer->dma_buf = dma_buf_get(buffer->fd);
>>> +	if (IS_ERR(buffer->dma_buf)) {
>>> +		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
>>> +			PTR_ERR(buffer->dma_buf));
>>> +		return PTR_ERR(buffer->dma_buf);
>>> +	}
>>> +
>>> +	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
>>> +	if (IS_ERR(buffer->attachment)) {
>>> +		dev_err(&rpdev->dev, "Failed to attach dma_buf\n");
>>> +		ret = PTR_ERR(buffer->attachment);
>>> +		goto err_dma_buf_put;
>>> +	}
>>> +
>>> +	buffer->sg_table = dma_buf_map_attachment(buffer->attachment,
>>> +						   DMA_BIDIRECTIONAL);
>>> +	if (IS_ERR(buffer->sg_table)) {
>>> +		dev_err(&rpdev->dev, "Failed to map attachment\n");
>>> +		ret = PTR_ERR(buffer->sg_table);
>>> +		goto err_dma_buf_detach;
>>> +	}
>>> +	phys = page_to_phys(sg_page(buffer->sg_table->sgl));
>>> +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
>>> +
>>> +	iova_pfn = alloc_iova_fast(apu->iovad, total_buf_space >> PAGE_SHIFT,
>>> +				   apu->iova_limit_pfn, true);
>>> +	if (!iova_pfn) {
>>> +		dev_err(&rpdev->dev, "Failed to allocate iova address\n");
>>> +		ret = -ENOMEM;
>>> +		goto err_dma_unmap_attachment;
>>> +	}
>>> +
>>> +	buffer->iova = PFN_PHYS(iova_pfn);
>>> +	ret = iommu_map(apu->rproc->domain, buffer->iova, phys, total_buf_space,
>>> +			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
>>> +	if (ret) {
>>> +		dev_err(&rpdev->dev, "Failed to iommu map\n");
>>> +		goto err_free_iova;
>>> +	}
>>> +
>>> +	return 0;
>>> +
>>> +err_free_iova:
>>> +	free_iova(apu->iovad, iova_pfn);
>>> +err_dma_unmap_attachment:
>>> +	dma_buf_unmap_attachment(buffer->attachment,
>>> +				 buffer->sg_table,
>>> +				 DMA_BIDIRECTIONAL);
>>> +err_dma_buf_detach:
>>> +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
>>> +err_dma_buf_put:
>>> +	dma_buf_put(buffer->dma_buf);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void apu_device_memory_unmap(struct rpmsg_apu *apu,
>>> +				    struct apu_buffer *buffer)
>>> +{
>>> +	int total_buf_space;
>>> +
>>> +	if (!buffer->fd)
>>> +		return;
>>> +
>>> +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
>>> +	iommu_unmap(apu->rproc->domain, buffer->iova, total_buf_space);
>>> +	free_iova(apu->iovad, PHYS_PFN(buffer->iova));
>>> +	dma_buf_unmap_attachment(buffer->attachment,
>>> +				 buffer->sg_table,
>>> +				 DMA_BIDIRECTIONAL);
>>> +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
>>> +	dma_buf_put(buffer->dma_buf);
>>> +}
>>> +
>>> +static int _apu_send_request(struct rpmsg_apu *apu,
>>> +			     struct rpmsg_device *rpdev,
>>> +			     struct apu_dev_request *req, int len)
>>> +{
>>> +
>>> +	struct rpmsg_request *rpmsg_req;
>>> +	int ret = 0;
>>> +
>>> +	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
>>> +	if (req->id < 0)
>>> +		return ret;
>>> +
>>> +	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
>>> +	if (!rpmsg_req)
>>> +		return -ENOMEM;
>>> +
>>> +	rpmsg_req->req = req;
>>> +	init_completion(&rpmsg_req->completion);
>>> +	list_add(&rpmsg_req->node, &requests);
>>> +
>>> +	ret = rpmsg_send(rpdev->ept, req, len);
>>> +	if (ret)
>>> +		goto free_req;
>>> +
>>> +	/* be careful with race here between timeout and callback*/
>>> +	ret = wait_for_completion_timeout(&rpmsg_req->completion,
>>> +					  msecs_to_jiffies(1000));
>>> +	if (!ret)
>>> +		ret = -ETIMEDOUT;
>>> +	else
>>> +		ret = 0;
>>> +
>>> +	ida_simple_remove(&req_ida, req->id);
>>> +
>>> +free_req:
>>> +
>>> +	list_del(&rpmsg_req->node);
>>> +	kfree(rpmsg_req);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static int apu_send_request(struct rpmsg_apu *apu,
>>> +			    struct apu_request *req)
>>> +{
>>> +	int ret;
>>> +	struct rpmsg_device *rpdev = apu->rpdev;
>>> +	struct apu_dev_request *dev_req;
>>> +	struct apu_buffer *buffer;
>>> +
>>> +	int size = req->size_in + req->size_out +
>>> +		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
>>> +	u32 *fd = (u32 *)(req->data + req->size_in + req->size_out);
>>> +	u32 *buffer_size = (u32 *)(fd + req->count);
>>> +	u32 *dev_req_da;
>>> +	u32 *dev_req_buffer_size;
>>> +	int i;
>>> +
>>> +	dev_req = kmalloc(size, GFP_KERNEL);
>>> +	if (!dev_req)
>>> +		return -ENOMEM;
>>> +
>>> +	dev_req->cmd = req->cmd;
>>> +	dev_req->size_in = req->size_in;
>>> +	dev_req->size_out = req->size_out;
>>> +	dev_req->count = req->count;
>>> +	dev_req_da = (u32 *)(dev_req->data + req->size_in + req->size_out);
>> I have started to review this set but it will take me more time to wrap my head
>> around what you are doing (the overall lack of comments in the code doesn't
>> help).
>>
>> In the mean time the "dev_req->data" above is very puzzling to me - did you mean
>> to write "req-data"?  Otherwise I don't know how this can work since
>> dev_req->data is not initalised after the kmalloc().
Data is declared as an array in struct apu_dev_request so we don't have 
to initialize its address.
This is working correctly but I think this would deserve much more 
documentation.
I will do it for the rewritten driver.
> I haven't received an answer to the above question nor any feedback from the
> comments I made on your previous set.  As such I will halt the revision of
> this set until I hear back from you.
My apologize, I have been focused on the driver rewriting, to use DRM 
framework instead of
managing the memory directly in the driver.

Thanks,
Alexandre

>
>> More comments will come tomorrow.
>>
>> Thanks,
>> Mathieu
>>
>>> +	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
>>> +	memcpy(dev_req->data, req->data, req->size_in);
>>> +
>>> +	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
>>> +	for (i = 0; i < req->count; i++) {
>>> +		buffer[i].fd = fd[i];
>>> +		ret = apu_device_memory_map(apu, &buffer[i]);
>>> +		if (ret)
>>> +			goto err_free_memory;
>>> +		dev_req_da[i] = buffer[i].iova;
>>> +		dev_req_buffer_size[i] = buffer_size[i];
>>> +	}
>>> +
>>> +	ret = _apu_send_request(apu, rpdev, dev_req, size);
>>> +
>>> +err_free_memory:
>>> +	for (i--; i >= 0; i--)
>>> +		apu_device_memory_unmap(apu, &buffer[i]);
>>> +
>>> +	req->result = dev_req->result;
>>> +	req->size_in = dev_req->size_in;
>>> +	req->size_out = dev_req->size_out;
>>> +	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
>>> +	       sizeof(u32) * req->count);
>>> +
>>> +	kfree(buffer);
>>> +	kfree(dev_req);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +
>>> +static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
>>> +			       unsigned long arg)
>>> +{
>>> +	struct rpmsg_apu *apu = fp->private_data;
>>> +	struct apu_request apu_req;
>>> +	struct apu_request *apu_req_full;
>>> +	void __user *argp = (void __user *)arg;
>>> +	int len;
>>> +	int ret;
>>> +
>>> +	switch (cmd) {
>>> +	case APU_SEND_REQ_IOCTL:
>>> +		/* Get the header */
>>> +		if (copy_from_user(&apu_req, argp,
>>> +				   sizeof(apu_req)))
>>> +			return -EFAULT;
>>> +
>>> +		len = sizeof(*apu_req_full) + apu_req.size_in +
>>> +			apu_req.size_out + apu_req.count * sizeof(u32) * 2;
>>> +		apu_req_full = kzalloc(len, GFP_KERNEL);
>>> +		if (!apu_req_full)
>>> +			return -ENOMEM;
>>> +
>>> +		/* Get the whole request */
>>> +		if (copy_from_user(apu_req_full, argp, len)) {
>>> +			kfree(apu_req_full);
>>> +			return -EFAULT;
>>> +		}
>>> +
>>> +		ret = apu_send_request(apu, apu_req_full);
>>> +		if (ret) {
>>> +			kfree(apu_req_full);
>>> +			return ret;
>>> +		}
>>> +
>>> +		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
>>> +				 sizeof(u32) * apu_req_full->count +
>>> +				 apu_req_full->size_in + apu_req_full->size_out))
>>> +			ret = -EFAULT;
>>> +
>>> +		kfree(apu_req_full);
>>> +		return ret;
>>> +
>>> +	default:
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
>>> +{
>>> +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
>>> +
>>> +	get_device(&apu->dev);
>>> +	filp->private_data = apu;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
>>> +{
>>> +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
>>> +
>>> +	put_device(&apu->dev);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static const struct file_operations rpmsg_eptdev_fops = {
>>> +	.owner = THIS_MODULE,
>>> +	.open = rpmsg_eptdev_open,
>>> +	.release = rpmsg_eptdev_release,
>>> +	.unlocked_ioctl = rpmsg_eptdev_ioctl,
>>> +	.compat_ioctl = rpmsg_eptdev_ioctl,
>>> +};
>>> +
>>> +static void iova_domain_release(struct kref *ref)
>>> +{
>>> +	put_iova_domain(&apu_iovad->iovad);
>>> +	kfree(apu_iovad);
>>> +	apu_iovad = NULL;
>>> +}
>>> +
>>> +static struct fw_rsc_iova *apu_find_rcs_iova(struct rpmsg_apu *apu)
>>> +{
>>> +	struct rproc *rproc = apu->rproc;
>>> +	struct resource_table *table;
>>> +	struct fw_rsc_iova *rsc;
>>> +	int i;
>>> +
>>> +	table = rproc->table_ptr;
>>> +	for (i = 0; i < table->num; i++) {
>>> +		int offset = table->offset[i];
>>> +		struct fw_rsc_hdr *hdr = (void *)table + offset;
>>> +
>>> +		switch (hdr->type) {
>>> +		case RSC_VENDOR_IOVA:
>>> +			rsc = (void *)hdr + sizeof(*hdr);
>>> +				return rsc;
>>> +			break;
>>> +		default:
>>> +			continue;
>>> +		}
>>> +	}
>>> +
>>> +	return NULL;
>>> +}
>>> +
>>> +static int apu_reserve_iova(struct rpmsg_apu *apu, struct iova_domain *iovad)
>>> +{
>>> +	struct rproc *rproc = apu->rproc;
>>> +	struct resource_table *table;
>>> +	struct fw_rsc_carveout *rsc;
>>> +	int i;
>>> +
>>> +	table = rproc->table_ptr;
>>> +	for (i = 0; i < table->num; i++) {
>>> +		int offset = table->offset[i];
>>> +		struct fw_rsc_hdr *hdr = (void *)table + offset;
>>> +
>>> +		if (hdr->type == RSC_CARVEOUT) {
>>> +			struct iova *iova;
>>> +
>>> +			rsc = (void *)hdr + sizeof(*hdr);
>>> +			iova = reserve_iova(iovad, PHYS_PFN(rsc->da),
>>> +					    PHYS_PFN(rsc->da + rsc->len));
>>> +			if (!iova) {
>>> +				dev_err(&apu->dev, "failed to reserve iova\n");
>>> +				return -ENOMEM;
>>> +			}
>>> +			dev_dbg(&apu->dev, "Reserve: %x - %x\n",
>>> +				rsc->da, rsc->da + rsc->len);
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int apu_init_iovad(struct rpmsg_apu *apu)
>>> +{
>>> +	struct fw_rsc_iova *rsc;
>>> +
>>> +	if (!apu->rproc->table_ptr) {
>>> +		dev_err(&apu->dev,
>>> +			"No resource_table: has the firmware been loaded ?\n");
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	rsc = apu_find_rcs_iova(apu);
>>> +	if (!rsc) {
>>> +		dev_err(&apu->dev, "No iova range defined in resource_table\n");
>>> +		return -ENOMEM;
>>> +	}
>>> +
>>> +	if (!apu_iovad) {
>>> +		apu_iovad = kzalloc(sizeof(*apu_iovad), GFP_KERNEL);
>>> +		if (!apu_iovad)
>>> +			return -ENOMEM;
>>> +
>>> +		init_iova_domain(&apu_iovad->iovad, PAGE_SIZE,
>>> +				 PHYS_PFN(rsc->da));
>>> +		apu_reserve_iova(apu, &apu_iovad->iovad);
>>> +		kref_init(&apu_iovad->refcount);
>>> +	} else
>>> +		kref_get(&apu_iovad->refcount);
>>> +
>>> +	apu->iovad = &apu_iovad->iovad;
>>> +	apu->iova_limit_pfn = PHYS_PFN(rsc->da + rsc->len) - 1;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
>>> +{
>>> +	/*
>>> +	 * To work, the APU RPMsg driver need to get the rproc device.
>>> +	 * Currently, we only use virtio so we could use that to find the
>>> +	 * remoteproc parent.
>>> +	 */
>>> +	if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
>>> +		dev_err(&rpdev->dev, "invalid rpmsg device\n");
>>> +		return ERR_PTR(-EINVAL);
>>> +	}
>>> +
>>> +	if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
>>> +		dev_err(&rpdev->dev, "unsupported bus\n");
>>> +		return ERR_PTR(-EINVAL);
>>> +	}
>>> +
>>> +	return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
>>> +}
>>> +
>>> +static void rpmsg_apu_release_device(struct device *dev)
>>> +{
>>> +	struct rpmsg_apu *apu = dev_to_apu(dev);
>>> +
>>> +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
>>> +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
>>> +	cdev_del(&apu->cdev);
>>> +	kfree(apu);
>>> +}
>>> +
>>> +static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
>>> +{
>>> +	struct rpmsg_apu *apu;
>>> +	struct device *dev;
>>> +	int ret;
>>> +
>>> +	apu = devm_kzalloc(&rpdev->dev, sizeof(*apu), GFP_KERNEL);
>>> +	if (!apu)
>>> +		return -ENOMEM;
>>> +	apu->rpdev = rpdev;
>>> +
>>> +	apu->rproc = apu_get_rproc(rpdev);
>>> +	if (IS_ERR_OR_NULL(apu->rproc))
>>> +		return PTR_ERR(apu->rproc);
>>> +
>>> +	dev = &apu->dev;
>>> +	device_initialize(dev);
>>> +	dev->parent = &rpdev->dev;
>>> +
>>> +	cdev_init(&apu->cdev, &rpmsg_eptdev_fops);
>>> +	apu->cdev.owner = THIS_MODULE;
>>> +
>>> +	ret = ida_simple_get(&rpmsg_minor_ida, 0, APU_DEV_MAX, GFP_KERNEL);
>>> +	if (ret < 0)
>>> +		goto free_apu;
>>> +	dev->devt = MKDEV(MAJOR(rpmsg_major), ret);
>>> +
>>> +	ret = ida_simple_get(&rpmsg_ctrl_ida, 0, 0, GFP_KERNEL);
>>> +	if (ret < 0)
>>> +		goto free_minor_ida;
>>> +	dev->id = ret;
>>> +	dev_set_name(&apu->dev, "apu%d", ret);
>>> +
>>> +	ret = cdev_add(&apu->cdev, dev->devt, 1);
>>> +	if (ret)
>>> +		goto free_ctrl_ida;
>>> +
>>> +	/* We can now rely on the release function for cleanup */
>>> +	dev->release = rpmsg_apu_release_device;
>>> +
>>> +	ret = device_add(dev);
>>> +	if (ret) {
>>> +		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
>>> +		put_device(dev);
>>> +	}
>>> +
>>> +	/* Make device dma capable by inheriting from parent's capabilities */
>>> +	set_dma_ops(&rpdev->dev, get_dma_ops(apu->rproc->dev.parent));
>>> +
>>> +	ret = dma_coerce_mask_and_coherent(&rpdev->dev,
>>> +					   dma_get_mask(apu->rproc->dev.parent));
>>> +	if (ret)
>>> +		goto err_put_device;
>>> +
>>> +	rpdev->dev.iommu_group = apu->rproc->dev.parent->iommu_group;
>>> +
>>> +	ret = apu_init_iovad(apu);
>>> +
>>> +	dev_set_drvdata(&rpdev->dev, apu);
>>> +
>>> +	return ret;
>>> +
>>> +err_put_device:
>>> +	put_device(dev);
>>> +free_ctrl_ida:
>>> +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
>>> +free_minor_ida:
>>> +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
>>> +free_apu:
>>> +	put_device(dev);
>>> +	kfree(apu);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void apu_rpmsg_remove(struct rpmsg_device *rpdev)
>>> +{
>>> +	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
>>> +
>>> +	if (apu_iovad)
>>> +		kref_put(&apu_iovad->refcount, iova_domain_release);
>>> +
>>> +	device_del(&apu->dev);
>>> +	put_device(&apu->dev);
>>> +	kfree(apu);
>>> +}
>>> +
>>> +static const struct rpmsg_device_id apu_rpmsg_match[] = {
>>> +	{ APU_RPMSG_SERVICE_MT8183 },
>>> +	{}
>>> +};
>>> +
>>> +static struct rpmsg_driver apu_rpmsg_driver = {
>>> +	.probe = apu_rpmsg_probe,
>>> +	.remove = apu_rpmsg_remove,
>>> +	.callback = apu_rpmsg_callback,
>>> +	.id_table = apu_rpmsg_match,
>>> +	.drv  = {
>>> +		.name  = "apu_rpmsg",
>>> +	},
>>> +};
>>> +
>>> +static int __init apu_rpmsg_init(void)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = alloc_chrdev_region(&rpmsg_major, 0, APU_DEV_MAX, "apu");
>>> +	if (ret < 0) {
>>> +		pr_err("apu: failed to allocate char dev region\n");
>>> +		return ret;
>>> +	}
>>> +
>>> +	return register_rpmsg_driver(&apu_rpmsg_driver);
>>> +}
>>> +arch_initcall(apu_rpmsg_init);
>>> +
>>> +static void __exit apu_rpmsg_exit(void)
>>> +{
>>> +	unregister_rpmsg_driver(&apu_rpmsg_driver);
>>> +}
>>> +module_exit(apu_rpmsg_exit);
>>> +
>>> +
>>> +MODULE_LICENSE("GPL");
>>> +MODULE_DESCRIPTION("APU RPMSG driver");
>>> diff --git a/drivers/rpmsg/apu_rpmsg.h b/drivers/rpmsg/apu_rpmsg.h
>>> new file mode 100644
>>> index 000000000000..54b5b7880750
>>> --- /dev/null
>>> +++ b/drivers/rpmsg/apu_rpmsg.h
>>> @@ -0,0 +1,52 @@
>>> +/* SPDX-License-Identifier: GPL-2.0
>>> + *
>>> + * Copyright 2020 BayLibre SAS
>>> + */
>>> +
>>> +#ifndef __APU_RPMSG_H__
>>> +#define __APU_RPMSG_H__
>>> +
>>> +/*
>>> + * Firmware request, must be aligned with the one defined in firmware.
>>> + * @id: Request id, used in the case of reply, to find the pending request
>>> + * @cmd: The command id to execute in the firmware
>>> + * @result: The result of the command executed on the firmware
>>> + * @size: The size of the data available in this request
>>> + * @count: The number of shared buffer
>>> + * @data: Contains the data attached with the request if size is greater than
>>> + *        zero, and the addresses of shared buffers if count is greater than
>>> + *        zero. Both the data and the shared buffer could be read and write
>>> + *        by the APU.
>>> + */
>>> +struct  apu_dev_request {
>>> +	u16 id;
>>> +	u16 cmd;
>>> +	u16 result;
>>> +	u16 size_in;
>>> +	u16 size_out;
>>> +	u16 count;
>>> +	u8 data[0];
>>> +} __packed;
>>> +
>>> +#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
>>> +#define APU_CTRL_SRC 1
>>> +#define APU_CTRL_DST 1
>>> +
>>> +/* Vendor specific resource table entry */
>>> +#define RSC_VENDOR_IOVA 128
>>> +
>>> +/*
>>> + * Firmware IOVA resource table entry
>>> + * Define a range of virtual device address that could mapped using the IOMMU.
>>> + * @da: Start virtual device address
>>> + * @len: Length of the virtual device address
>>> + * @name: name of the resource
>>> + */
>>> +struct fw_rsc_iova {
>>> +	u32 da;
>>> +	u32 len;
>>> +	u32 reserved;
>>> +	u8 name[32];
>>> +} __packed;
>>> +
>>> +#endif /* __APU_RPMSG_H__ */
>>> diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
>>> new file mode 100644
>>> index 000000000000..81c9e4af9a94
>>> --- /dev/null
>>> +++ b/include/uapi/linux/apu_rpmsg.h
>>> @@ -0,0 +1,36 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>>> +/*
>>> + * Copyright (c) 2020 BayLibre
>>> + */
>>> +
>>> +#ifndef _UAPI_RPMSG_APU_H_
>>> +#define _UAPI_RPMSG_APU_H_
>>> +
>>> +#include <linux/ioctl.h>
>>> +#include <linux/types.h>
>>> +
>>> +/*
>>> + * Structure containing the APU request from userspace application
>>> + * @cmd: The id of the command to execute on the APU
>>> + * @result: The result of the command executed on the APU
>>> + * @size: The size of the data available in this request
>>> + * @count: The number of shared buffer
>>> + * @data: Contains the data attached with the request if size is greater than
>>> + *        zero, and the files descriptors of shared buffers if count is greater
>>> + *        than zero. Both the data and the shared buffer could be read and write
>>> + *        by the APU.
>>> + */
>>> +struct apu_request {
>>> +	__u16 cmd;
>>> +	__u16 result;
>>> +	__u16 size_in;
>>> +	__u16 size_out;
>>> +	__u16 count;
>>> +	__u16 reserved;
>>> +	__u8 data[0];
>>> +};
>>> +
>>> +/* Send synchronous request to an APU */
>>> +#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
>>> +
>>> +#endif
>>> -- 
>>> 2.26.2
>>>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183
@ 2021-07-20  8:24         ` Alexandre Bailon
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandre Bailon @ 2021-07-20  8:24 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: ohad, gpain, stephane.leprovost, jstephan, linux-remoteproc,
	linux-kernel, dri-devel, linaro-mm-sig, mturquette,
	bjorn.andersson, christian.koenig, linux-media

Hi Mathieu,

> On Wed, Oct 14, 2020 at 04:55:34PM -0600, Mathieu Poirier wrote:
>> Hi Alexandre,
>>
>> On Wed, Sep 30, 2020 at 01:53:47PM +0200, Alexandre Bailon wrote:
>>> This adds a driver to communicate with the APU available
>>> in the mt8183. The driver is generic and could be used for other APU.
>>> It mostly provides a userspace interface to send messages and
>>> and share big buffers with the APU.
>>>
>>> Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
>>> ---
>>>   drivers/rpmsg/Kconfig          |   9 +
>>>   drivers/rpmsg/Makefile         |   1 +
>>>   drivers/rpmsg/apu_rpmsg.c      | 606 +++++++++++++++++++++++++++++++++
>>>   drivers/rpmsg/apu_rpmsg.h      |  52 +++
>>>   include/uapi/linux/apu_rpmsg.h |  36 ++
>>>   5 files changed, 704 insertions(+)
>>>   create mode 100644 drivers/rpmsg/apu_rpmsg.c
>>>   create mode 100644 drivers/rpmsg/apu_rpmsg.h
>>>   create mode 100644 include/uapi/linux/apu_rpmsg.h
>>>
>>> diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
>>> index f96716893c2a..3437c6fc8647 100644
>>> --- a/drivers/rpmsg/Kconfig
>>> +++ b/drivers/rpmsg/Kconfig
>>> @@ -64,4 +64,13 @@ config RPMSG_VIRTIO
>>>   	select RPMSG
>>>   	select VIRTIO
>>>   
>>> +config RPMSG_APU
>>> +	tristate "APU RPMSG driver"
>>> +	help
>>> +	  This provides a RPMSG driver that provides some facilities to
>>> +	  communicate with an accelerated processing unit (APU).
>>> +	  This creates one or more char files that could be used by userspace
>>> +	  to send a message to an APU. In addition, this also take care of
>>> +	  sharing the memory buffer with the APU.
>>> +
>>>   endmenu
>>> diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
>>> index ffe932ef6050..93e0f3de99c9 100644
>>> --- a/drivers/rpmsg/Makefile
>>> +++ b/drivers/rpmsg/Makefile
>>> @@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
>>>   obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
>>>   obj-$(CONFIG_RPMSG_QCOM_SMD)	+= qcom_smd.o
>>>   obj-$(CONFIG_RPMSG_VIRTIO)	+= virtio_rpmsg_bus.o
>>> +obj-$(CONFIG_RPMSG_APU)		+= apu_rpmsg.o
>>> diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
>>> new file mode 100644
>>> index 000000000000..5131b8b8e1f2
>>> --- /dev/null
>>> +++ b/drivers/rpmsg/apu_rpmsg.c
>>> @@ -0,0 +1,606 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +//
>>> +// Copyright 2020 BayLibre SAS
>>> +
>>> +#include <linux/cdev.h>
>>> +#include <linux/dma-buf.h>
>>> +#include <linux/iommu.h>
>>> +#include <linux/iova.h>
>>> +#include <linux/types.h>
>>> +#include <linux/module.h>
>>> +#include <linux/slab.h>
>>> +#include <linux/remoteproc.h>
>>> +#include <linux/rpmsg.h>
>>> +#include <linux/of.h>
>>> +#include <linux/platform_device.h>
>>> +#include "rpmsg_internal.h"
>>> +
>>> +#include <uapi/linux/apu_rpmsg.h>
>>> +
>>> +#include "apu_rpmsg.h"
>>> +
>>> +/* Maximum of APU devices supported */
>>> +#define APU_DEV_MAX 2
>>> +
>>> +#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
>>> +#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
>>> +
>>> +struct rpmsg_apu {
>>> +	struct rpmsg_device *rpdev;
>>> +	struct cdev cdev;
>>> +	struct device dev;
>>> +
>>> +	struct rproc *rproc;
>>> +	struct iommu_domain *domain;
>>> +	struct iova_domain *iovad;
>>> +	int iova_limit_pfn;
>>> +};
>>> +
>>> +struct rpmsg_request {
>>> +	struct completion completion;
>>> +	struct list_head node;
>>> +	void *req;
>>> +};
>>> +
>>> +struct apu_buffer {
>>> +	int fd;
>>> +	struct dma_buf *dma_buf;
>>> +	struct dma_buf_attachment *attachment;
>>> +	struct sg_table *sg_table;
>>> +	u32 iova;
>>> +};
>>> +
>>> +/*
>>> + * Shared IOVA domain.
>>> + * The MT8183 has two VP6 core but they are sharing the IOVA.
>>> + * They could be used alone, or together. In order to avoid conflict,
>>> + * create an IOVA domain that could be shared by those two core.
>>> + * @iovad: The IOVA domain to share between the APU cores
>>> + * @refcount: Allow to automatically release the IOVA domain once all the APU
>>> + *            cores has been stopped
>>> + */
>>> +struct apu_iova_domain {
>>> +	struct iova_domain iovad;
>>> +	struct kref refcount;
>>> +};
>>> +
>>> +static dev_t rpmsg_major;
>>> +static DEFINE_IDA(rpmsg_ctrl_ida);
>>> +static DEFINE_IDA(rpmsg_minor_ida);
>>> +static DEFINE_IDA(req_ida);
>>> +static LIST_HEAD(requests);
>>> +static struct apu_iova_domain *apu_iovad;
>>> +
>>> +static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int count,
>>> +			      void *priv, u32 addr)
>>> +{
>>> +	struct rpmsg_request *rpmsg_req;
>>> +	struct apu_dev_request *hdr = data;
>>> +
>>> +	list_for_each_entry(rpmsg_req, &requests, node) {
>>> +		struct apu_dev_request *tmp_hdr = rpmsg_req->req;
>>> +
>>> +		if (hdr->id == tmp_hdr->id) {
>>> +			memcpy(rpmsg_req->req, data, count);
>>> +			complete(&rpmsg_req->completion);
>>> +
>>> +			return 0;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int apu_device_memory_map(struct rpmsg_apu *apu,
>>> +				 struct apu_buffer *buffer)
>>> +{
>>> +	struct rpmsg_device *rpdev = apu->rpdev;
>>> +	phys_addr_t phys;
>>> +	int total_buf_space;
>>> +	int iova_pfn;
>>> +	int ret;
>>> +
>>> +	if (!buffer->fd)
>>> +		return 0;
>>> +
>>> +	buffer->dma_buf = dma_buf_get(buffer->fd);
>>> +	if (IS_ERR(buffer->dma_buf)) {
>>> +		dev_err(&rpdev->dev, "Failed to get dma_buf from fd: %ld\n",
>>> +			PTR_ERR(buffer->dma_buf));
>>> +		return PTR_ERR(buffer->dma_buf);
>>> +	}
>>> +
>>> +	buffer->attachment = dma_buf_attach(buffer->dma_buf, &rpdev->dev);
>>> +	if (IS_ERR(buffer->attachment)) {
>>> +		dev_err(&rpdev->dev, "Failed to attach dma_buf\n");
>>> +		ret = PTR_ERR(buffer->attachment);
>>> +		goto err_dma_buf_put;
>>> +	}
>>> +
>>> +	buffer->sg_table = dma_buf_map_attachment(buffer->attachment,
>>> +						   DMA_BIDIRECTIONAL);
>>> +	if (IS_ERR(buffer->sg_table)) {
>>> +		dev_err(&rpdev->dev, "Failed to map attachment\n");
>>> +		ret = PTR_ERR(buffer->sg_table);
>>> +		goto err_dma_buf_detach;
>>> +	}
>>> +	phys = page_to_phys(sg_page(buffer->sg_table->sgl));
>>> +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
>>> +
>>> +	iova_pfn = alloc_iova_fast(apu->iovad, total_buf_space >> PAGE_SHIFT,
>>> +				   apu->iova_limit_pfn, true);
>>> +	if (!iova_pfn) {
>>> +		dev_err(&rpdev->dev, "Failed to allocate iova address\n");
>>> +		ret = -ENOMEM;
>>> +		goto err_dma_unmap_attachment;
>>> +	}
>>> +
>>> +	buffer->iova = PFN_PHYS(iova_pfn);
>>> +	ret = iommu_map(apu->rproc->domain, buffer->iova, phys, total_buf_space,
>>> +			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
>>> +	if (ret) {
>>> +		dev_err(&rpdev->dev, "Failed to iommu map\n");
>>> +		goto err_free_iova;
>>> +	}
>>> +
>>> +	return 0;
>>> +
>>> +err_free_iova:
>>> +	free_iova(apu->iovad, iova_pfn);
>>> +err_dma_unmap_attachment:
>>> +	dma_buf_unmap_attachment(buffer->attachment,
>>> +				 buffer->sg_table,
>>> +				 DMA_BIDIRECTIONAL);
>>> +err_dma_buf_detach:
>>> +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
>>> +err_dma_buf_put:
>>> +	dma_buf_put(buffer->dma_buf);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void apu_device_memory_unmap(struct rpmsg_apu *apu,
>>> +				    struct apu_buffer *buffer)
>>> +{
>>> +	int total_buf_space;
>>> +
>>> +	if (!buffer->fd)
>>> +		return;
>>> +
>>> +	total_buf_space = sg_dma_len(buffer->sg_table->sgl);
>>> +	iommu_unmap(apu->rproc->domain, buffer->iova, total_buf_space);
>>> +	free_iova(apu->iovad, PHYS_PFN(buffer->iova));
>>> +	dma_buf_unmap_attachment(buffer->attachment,
>>> +				 buffer->sg_table,
>>> +				 DMA_BIDIRECTIONAL);
>>> +	dma_buf_detach(buffer->dma_buf, buffer->attachment);
>>> +	dma_buf_put(buffer->dma_buf);
>>> +}
>>> +
>>> +static int _apu_send_request(struct rpmsg_apu *apu,
>>> +			     struct rpmsg_device *rpdev,
>>> +			     struct apu_dev_request *req, int len)
>>> +{
>>> +
>>> +	struct rpmsg_request *rpmsg_req;
>>> +	int ret = 0;
>>> +
>>> +	req->id = ida_simple_get(&req_ida, 0, 0xffff, GFP_KERNEL);
>>> +	if (req->id < 0)
>>> +		return ret;
>>> +
>>> +	rpmsg_req = kzalloc(sizeof(*rpmsg_req), GFP_KERNEL);
>>> +	if (!rpmsg_req)
>>> +		return -ENOMEM;
>>> +
>>> +	rpmsg_req->req = req;
>>> +	init_completion(&rpmsg_req->completion);
>>> +	list_add(&rpmsg_req->node, &requests);
>>> +
>>> +	ret = rpmsg_send(rpdev->ept, req, len);
>>> +	if (ret)
>>> +		goto free_req;
>>> +
>>> +	/* be careful with race here between timeout and callback*/
>>> +	ret = wait_for_completion_timeout(&rpmsg_req->completion,
>>> +					  msecs_to_jiffies(1000));
>>> +	if (!ret)
>>> +		ret = -ETIMEDOUT;
>>> +	else
>>> +		ret = 0;
>>> +
>>> +	ida_simple_remove(&req_ida, req->id);
>>> +
>>> +free_req:
>>> +
>>> +	list_del(&rpmsg_req->node);
>>> +	kfree(rpmsg_req);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static int apu_send_request(struct rpmsg_apu *apu,
>>> +			    struct apu_request *req)
>>> +{
>>> +	int ret;
>>> +	struct rpmsg_device *rpdev = apu->rpdev;
>>> +	struct apu_dev_request *dev_req;
>>> +	struct apu_buffer *buffer;
>>> +
>>> +	int size = req->size_in + req->size_out +
>>> +		sizeof(u32) * req->count * 2 + sizeof(*dev_req);
>>> +	u32 *fd = (u32 *)(req->data + req->size_in + req->size_out);
>>> +	u32 *buffer_size = (u32 *)(fd + req->count);
>>> +	u32 *dev_req_da;
>>> +	u32 *dev_req_buffer_size;
>>> +	int i;
>>> +
>>> +	dev_req = kmalloc(size, GFP_KERNEL);
>>> +	if (!dev_req)
>>> +		return -ENOMEM;
>>> +
>>> +	dev_req->cmd = req->cmd;
>>> +	dev_req->size_in = req->size_in;
>>> +	dev_req->size_out = req->size_out;
>>> +	dev_req->count = req->count;
>>> +	dev_req_da = (u32 *)(dev_req->data + req->size_in + req->size_out);
>> I have started to review this set but it will take me more time to wrap my head
>> around what you are doing (the overall lack of comments in the code doesn't
>> help).
>>
>> In the mean time the "dev_req->data" above is very puzzling to me - did you mean
>> to write "req-data"?  Otherwise I don't know how this can work since
>> dev_req->data is not initalised after the kmalloc().
Data is declared as an array in struct apu_dev_request so we don't have 
to initialize its address.
This is working correctly but I think this would deserve much more 
documentation.
I will do it for the rewritten driver.
> I haven't received an answer to the above question nor any feedback from the
> comments I made on your previous set.  As such I will halt the revision of
> this set until I hear back from you.
My apologize, I have been focused on the driver rewriting, to use DRM 
framework instead of
managing the memory directly in the driver.

Thanks,
Alexandre

>
>> More comments will come tomorrow.
>>
>> Thanks,
>> Mathieu
>>
>>> +	dev_req_buffer_size = (u32 *)(dev_req_da + dev_req->count);
>>> +	memcpy(dev_req->data, req->data, req->size_in);
>>> +
>>> +	buffer = kmalloc_array(req->count, sizeof(*buffer), GFP_KERNEL);
>>> +	for (i = 0; i < req->count; i++) {
>>> +		buffer[i].fd = fd[i];
>>> +		ret = apu_device_memory_map(apu, &buffer[i]);
>>> +		if (ret)
>>> +			goto err_free_memory;
>>> +		dev_req_da[i] = buffer[i].iova;
>>> +		dev_req_buffer_size[i] = buffer_size[i];
>>> +	}
>>> +
>>> +	ret = _apu_send_request(apu, rpdev, dev_req, size);
>>> +
>>> +err_free_memory:
>>> +	for (i--; i >= 0; i--)
>>> +		apu_device_memory_unmap(apu, &buffer[i]);
>>> +
>>> +	req->result = dev_req->result;
>>> +	req->size_in = dev_req->size_in;
>>> +	req->size_out = dev_req->size_out;
>>> +	memcpy(req->data, dev_req->data, dev_req->size_in + dev_req->size_out +
>>> +	       sizeof(u32) * req->count);
>>> +
>>> +	kfree(buffer);
>>> +	kfree(dev_req);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +
>>> +static long rpmsg_eptdev_ioctl(struct file *fp, unsigned int cmd,
>>> +			       unsigned long arg)
>>> +{
>>> +	struct rpmsg_apu *apu = fp->private_data;
>>> +	struct apu_request apu_req;
>>> +	struct apu_request *apu_req_full;
>>> +	void __user *argp = (void __user *)arg;
>>> +	int len;
>>> +	int ret;
>>> +
>>> +	switch (cmd) {
>>> +	case APU_SEND_REQ_IOCTL:
>>> +		/* Get the header */
>>> +		if (copy_from_user(&apu_req, argp,
>>> +				   sizeof(apu_req)))
>>> +			return -EFAULT;
>>> +
>>> +		len = sizeof(*apu_req_full) + apu_req.size_in +
>>> +			apu_req.size_out + apu_req.count * sizeof(u32) * 2;
>>> +		apu_req_full = kzalloc(len, GFP_KERNEL);
>>> +		if (!apu_req_full)
>>> +			return -ENOMEM;
>>> +
>>> +		/* Get the whole request */
>>> +		if (copy_from_user(apu_req_full, argp, len)) {
>>> +			kfree(apu_req_full);
>>> +			return -EFAULT;
>>> +		}
>>> +
>>> +		ret = apu_send_request(apu, apu_req_full);
>>> +		if (ret) {
>>> +			kfree(apu_req_full);
>>> +			return ret;
>>> +		}
>>> +
>>> +		if (copy_to_user(argp, apu_req_full, sizeof(apu_req) +
>>> +				 sizeof(u32) * apu_req_full->count +
>>> +				 apu_req_full->size_in + apu_req_full->size_out))
>>> +			ret = -EFAULT;
>>> +
>>> +		kfree(apu_req_full);
>>> +		return ret;
>>> +
>>> +	default:
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
>>> +{
>>> +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
>>> +
>>> +	get_device(&apu->dev);
>>> +	filp->private_data = apu;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
>>> +{
>>> +	struct rpmsg_apu *apu = cdev_to_apu(inode->i_cdev);
>>> +
>>> +	put_device(&apu->dev);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static const struct file_operations rpmsg_eptdev_fops = {
>>> +	.owner = THIS_MODULE,
>>> +	.open = rpmsg_eptdev_open,
>>> +	.release = rpmsg_eptdev_release,
>>> +	.unlocked_ioctl = rpmsg_eptdev_ioctl,
>>> +	.compat_ioctl = rpmsg_eptdev_ioctl,
>>> +};
>>> +
>>> +static void iova_domain_release(struct kref *ref)
>>> +{
>>> +	put_iova_domain(&apu_iovad->iovad);
>>> +	kfree(apu_iovad);
>>> +	apu_iovad = NULL;
>>> +}
>>> +
>>> +static struct fw_rsc_iova *apu_find_rcs_iova(struct rpmsg_apu *apu)
>>> +{
>>> +	struct rproc *rproc = apu->rproc;
>>> +	struct resource_table *table;
>>> +	struct fw_rsc_iova *rsc;
>>> +	int i;
>>> +
>>> +	table = rproc->table_ptr;
>>> +	for (i = 0; i < table->num; i++) {
>>> +		int offset = table->offset[i];
>>> +		struct fw_rsc_hdr *hdr = (void *)table + offset;
>>> +
>>> +		switch (hdr->type) {
>>> +		case RSC_VENDOR_IOVA:
>>> +			rsc = (void *)hdr + sizeof(*hdr);
>>> +				return rsc;
>>> +			break;
>>> +		default:
>>> +			continue;
>>> +		}
>>> +	}
>>> +
>>> +	return NULL;
>>> +}
>>> +
>>> +static int apu_reserve_iova(struct rpmsg_apu *apu, struct iova_domain *iovad)
>>> +{
>>> +	struct rproc *rproc = apu->rproc;
>>> +	struct resource_table *table;
>>> +	struct fw_rsc_carveout *rsc;
>>> +	int i;
>>> +
>>> +	table = rproc->table_ptr;
>>> +	for (i = 0; i < table->num; i++) {
>>> +		int offset = table->offset[i];
>>> +		struct fw_rsc_hdr *hdr = (void *)table + offset;
>>> +
>>> +		if (hdr->type == RSC_CARVEOUT) {
>>> +			struct iova *iova;
>>> +
>>> +			rsc = (void *)hdr + sizeof(*hdr);
>>> +			iova = reserve_iova(iovad, PHYS_PFN(rsc->da),
>>> +					    PHYS_PFN(rsc->da + rsc->len));
>>> +			if (!iova) {
>>> +				dev_err(&apu->dev, "failed to reserve iova\n");
>>> +				return -ENOMEM;
>>> +			}
>>> +			dev_dbg(&apu->dev, "Reserve: %x - %x\n",
>>> +				rsc->da, rsc->da + rsc->len);
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int apu_init_iovad(struct rpmsg_apu *apu)
>>> +{
>>> +	struct fw_rsc_iova *rsc;
>>> +
>>> +	if (!apu->rproc->table_ptr) {
>>> +		dev_err(&apu->dev,
>>> +			"No resource_table: has the firmware been loaded ?\n");
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	rsc = apu_find_rcs_iova(apu);
>>> +	if (!rsc) {
>>> +		dev_err(&apu->dev, "No iova range defined in resource_table\n");
>>> +		return -ENOMEM;
>>> +	}
>>> +
>>> +	if (!apu_iovad) {
>>> +		apu_iovad = kzalloc(sizeof(*apu_iovad), GFP_KERNEL);
>>> +		if (!apu_iovad)
>>> +			return -ENOMEM;
>>> +
>>> +		init_iova_domain(&apu_iovad->iovad, PAGE_SIZE,
>>> +				 PHYS_PFN(rsc->da));
>>> +		apu_reserve_iova(apu, &apu_iovad->iovad);
>>> +		kref_init(&apu_iovad->refcount);
>>> +	} else
>>> +		kref_get(&apu_iovad->refcount);
>>> +
>>> +	apu->iovad = &apu_iovad->iovad;
>>> +	apu->iova_limit_pfn = PHYS_PFN(rsc->da + rsc->len) - 1;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static struct rproc *apu_get_rproc(struct rpmsg_device *rpdev)
>>> +{
>>> +	/*
>>> +	 * To work, the APU RPMsg driver need to get the rproc device.
>>> +	 * Currently, we only use virtio so we could use that to find the
>>> +	 * remoteproc parent.
>>> +	 */
>>> +	if (!rpdev->dev.parent && rpdev->dev.parent->bus) {
>>> +		dev_err(&rpdev->dev, "invalid rpmsg device\n");
>>> +		return ERR_PTR(-EINVAL);
>>> +	}
>>> +
>>> +	if (strcmp(rpdev->dev.parent->bus->name, "virtio")) {
>>> +		dev_err(&rpdev->dev, "unsupported bus\n");
>>> +		return ERR_PTR(-EINVAL);
>>> +	}
>>> +
>>> +	return vdev_to_rproc(dev_to_virtio(rpdev->dev.parent));
>>> +}
>>> +
>>> +static void rpmsg_apu_release_device(struct device *dev)
>>> +{
>>> +	struct rpmsg_apu *apu = dev_to_apu(dev);
>>> +
>>> +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
>>> +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
>>> +	cdev_del(&apu->cdev);
>>> +	kfree(apu);
>>> +}
>>> +
>>> +static int apu_rpmsg_probe(struct rpmsg_device *rpdev)
>>> +{
>>> +	struct rpmsg_apu *apu;
>>> +	struct device *dev;
>>> +	int ret;
>>> +
>>> +	apu = devm_kzalloc(&rpdev->dev, sizeof(*apu), GFP_KERNEL);
>>> +	if (!apu)
>>> +		return -ENOMEM;
>>> +	apu->rpdev = rpdev;
>>> +
>>> +	apu->rproc = apu_get_rproc(rpdev);
>>> +	if (IS_ERR_OR_NULL(apu->rproc))
>>> +		return PTR_ERR(apu->rproc);
>>> +
>>> +	dev = &apu->dev;
>>> +	device_initialize(dev);
>>> +	dev->parent = &rpdev->dev;
>>> +
>>> +	cdev_init(&apu->cdev, &rpmsg_eptdev_fops);
>>> +	apu->cdev.owner = THIS_MODULE;
>>> +
>>> +	ret = ida_simple_get(&rpmsg_minor_ida, 0, APU_DEV_MAX, GFP_KERNEL);
>>> +	if (ret < 0)
>>> +		goto free_apu;
>>> +	dev->devt = MKDEV(MAJOR(rpmsg_major), ret);
>>> +
>>> +	ret = ida_simple_get(&rpmsg_ctrl_ida, 0, 0, GFP_KERNEL);
>>> +	if (ret < 0)
>>> +		goto free_minor_ida;
>>> +	dev->id = ret;
>>> +	dev_set_name(&apu->dev, "apu%d", ret);
>>> +
>>> +	ret = cdev_add(&apu->cdev, dev->devt, 1);
>>> +	if (ret)
>>> +		goto free_ctrl_ida;
>>> +
>>> +	/* We can now rely on the release function for cleanup */
>>> +	dev->release = rpmsg_apu_release_device;
>>> +
>>> +	ret = device_add(dev);
>>> +	if (ret) {
>>> +		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
>>> +		put_device(dev);
>>> +	}
>>> +
>>> +	/* Make device dma capable by inheriting from parent's capabilities */
>>> +	set_dma_ops(&rpdev->dev, get_dma_ops(apu->rproc->dev.parent));
>>> +
>>> +	ret = dma_coerce_mask_and_coherent(&rpdev->dev,
>>> +					   dma_get_mask(apu->rproc->dev.parent));
>>> +	if (ret)
>>> +		goto err_put_device;
>>> +
>>> +	rpdev->dev.iommu_group = apu->rproc->dev.parent->iommu_group;
>>> +
>>> +	ret = apu_init_iovad(apu);
>>> +
>>> +	dev_set_drvdata(&rpdev->dev, apu);
>>> +
>>> +	return ret;
>>> +
>>> +err_put_device:
>>> +	put_device(dev);
>>> +free_ctrl_ida:
>>> +	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
>>> +free_minor_ida:
>>> +	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
>>> +free_apu:
>>> +	put_device(dev);
>>> +	kfree(apu);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void apu_rpmsg_remove(struct rpmsg_device *rpdev)
>>> +{
>>> +	struct rpmsg_apu *apu = dev_get_drvdata(&rpdev->dev);
>>> +
>>> +	if (apu_iovad)
>>> +		kref_put(&apu_iovad->refcount, iova_domain_release);
>>> +
>>> +	device_del(&apu->dev);
>>> +	put_device(&apu->dev);
>>> +	kfree(apu);
>>> +}
>>> +
>>> +static const struct rpmsg_device_id apu_rpmsg_match[] = {
>>> +	{ APU_RPMSG_SERVICE_MT8183 },
>>> +	{}
>>> +};
>>> +
>>> +static struct rpmsg_driver apu_rpmsg_driver = {
>>> +	.probe = apu_rpmsg_probe,
>>> +	.remove = apu_rpmsg_remove,
>>> +	.callback = apu_rpmsg_callback,
>>> +	.id_table = apu_rpmsg_match,
>>> +	.drv  = {
>>> +		.name  = "apu_rpmsg",
>>> +	},
>>> +};
>>> +
>>> +static int __init apu_rpmsg_init(void)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = alloc_chrdev_region(&rpmsg_major, 0, APU_DEV_MAX, "apu");
>>> +	if (ret < 0) {
>>> +		pr_err("apu: failed to allocate char dev region\n");
>>> +		return ret;
>>> +	}
>>> +
>>> +	return register_rpmsg_driver(&apu_rpmsg_driver);
>>> +}
>>> +arch_initcall(apu_rpmsg_init);
>>> +
>>> +static void __exit apu_rpmsg_exit(void)
>>> +{
>>> +	unregister_rpmsg_driver(&apu_rpmsg_driver);
>>> +}
>>> +module_exit(apu_rpmsg_exit);
>>> +
>>> +
>>> +MODULE_LICENSE("GPL");
>>> +MODULE_DESCRIPTION("APU RPMSG driver");
>>> diff --git a/drivers/rpmsg/apu_rpmsg.h b/drivers/rpmsg/apu_rpmsg.h
>>> new file mode 100644
>>> index 000000000000..54b5b7880750
>>> --- /dev/null
>>> +++ b/drivers/rpmsg/apu_rpmsg.h
>>> @@ -0,0 +1,52 @@
>>> +/* SPDX-License-Identifier: GPL-2.0
>>> + *
>>> + * Copyright 2020 BayLibre SAS
>>> + */
>>> +
>>> +#ifndef __APU_RPMSG_H__
>>> +#define __APU_RPMSG_H__
>>> +
>>> +/*
>>> + * Firmware request, must be aligned with the one defined in firmware.
>>> + * @id: Request id, used in the case of reply, to find the pending request
>>> + * @cmd: The command id to execute in the firmware
>>> + * @result: The result of the command executed on the firmware
>>> + * @size: The size of the data available in this request
>>> + * @count: The number of shared buffer
>>> + * @data: Contains the data attached with the request if size is greater than
>>> + *        zero, and the addresses of shared buffers if count is greater than
>>> + *        zero. Both the data and the shared buffer could be read and write
>>> + *        by the APU.
>>> + */
>>> +struct  apu_dev_request {
>>> +	u16 id;
>>> +	u16 cmd;
>>> +	u16 result;
>>> +	u16 size_in;
>>> +	u16 size_out;
>>> +	u16 count;
>>> +	u8 data[0];
>>> +} __packed;
>>> +
>>> +#define APU_RPMSG_SERVICE_MT8183 "rpmsg-mt8183-apu0"
>>> +#define APU_CTRL_SRC 1
>>> +#define APU_CTRL_DST 1
>>> +
>>> +/* Vendor specific resource table entry */
>>> +#define RSC_VENDOR_IOVA 128
>>> +
>>> +/*
>>> + * Firmware IOVA resource table entry
>>> + * Define a range of virtual device address that could mapped using the IOMMU.
>>> + * @da: Start virtual device address
>>> + * @len: Length of the virtual device address
>>> + * @name: name of the resource
>>> + */
>>> +struct fw_rsc_iova {
>>> +	u32 da;
>>> +	u32 len;
>>> +	u32 reserved;
>>> +	u8 name[32];
>>> +} __packed;
>>> +
>>> +#endif /* __APU_RPMSG_H__ */
>>> diff --git a/include/uapi/linux/apu_rpmsg.h b/include/uapi/linux/apu_rpmsg.h
>>> new file mode 100644
>>> index 000000000000..81c9e4af9a94
>>> --- /dev/null
>>> +++ b/include/uapi/linux/apu_rpmsg.h
>>> @@ -0,0 +1,36 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>>> +/*
>>> + * Copyright (c) 2020 BayLibre
>>> + */
>>> +
>>> +#ifndef _UAPI_RPMSG_APU_H_
>>> +#define _UAPI_RPMSG_APU_H_
>>> +
>>> +#include <linux/ioctl.h>
>>> +#include <linux/types.h>
>>> +
>>> +/*
>>> + * Structure containing the APU request from userspace application
>>> + * @cmd: The id of the command to execute on the APU
>>> + * @result: The result of the command executed on the APU
>>> + * @size: The size of the data available in this request
>>> + * @count: The number of shared buffer
>>> + * @data: Contains the data attached with the request if size is greater than
>>> + *        zero, and the files descriptors of shared buffers if count is greater
>>> + *        than zero. Both the data and the shared buffer could be read and write
>>> + *        by the APU.
>>> + */
>>> +struct apu_request {
>>> +	__u16 cmd;
>>> +	__u16 result;
>>> +	__u16 size_in;
>>> +	__u16 size_out;
>>> +	__u16 count;
>>> +	__u16 reserved;
>>> +	__u8 data[0];
>>> +};
>>> +
>>> +/* Send synchronous request to an APU */
>>> +#define APU_SEND_REQ_IOCTL	_IOWR(0xb7, 0x2, struct apu_request)
>>> +
>>> +#endif
>>> -- 
>>> 2.26.2
>>>

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-07-21  7:28 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-30 11:53 [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU) Alexandre Bailon
2020-09-30 11:53 ` Alexandre Bailon
2020-09-30 11:53 ` [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183 Alexandre Bailon
2020-09-30 11:53   ` Alexandre Bailon
2020-10-14 22:55   ` Mathieu Poirier
2020-10-14 22:55     ` Mathieu Poirier
2020-10-15 16:33     ` Mathieu Poirier
2020-10-15 16:33       ` Mathieu Poirier
2021-07-20  8:24       ` Alexandre Bailon
2021-07-20  8:24         ` Alexandre Bailon
2020-09-30 11:53 ` [RFC PATCH 2/4] rpmsg: apu_rpmsg: Add support for async apu request Alexandre Bailon
2020-09-30 11:53   ` Alexandre Bailon
2020-09-30 11:53 ` [RFC PATCH 3/4] rpmsg: apu_rpmsg: update the way to store IOMMU mapping Alexandre Bailon
2020-09-30 11:53   ` Alexandre Bailon
2020-09-30 11:53 ` [RFC PATCH 4/4] rpmsg: apu_rpmsg: Add an IOCTL to request " Alexandre Bailon
2020-09-30 11:53   ` Alexandre Bailon
2020-10-01  8:48 ` [RFC PATCH 0/4] Add a RPMsg driver to support AI Processing Unit (APU) Daniel Vetter
2020-10-01  8:48   ` Daniel Vetter
2020-10-01 17:28   ` Alexandre Bailon
2020-10-01 17:28     ` Alexandre Bailon
2020-10-02  9:35     ` Daniel Vetter
2020-10-02  9:35       ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.