linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
@ 2022-02-07 12:59 Paul Cercueil
  2022-02-07 12:59 ` [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue Paul Cercueil
                   ` (11 more replies)
  0 siblings, 12 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Hi Jonathan,

This is the V2 of my patchset that introduces a new userspace interface
based on DMABUF objects to complement the fileio API, and adds write()
support to the existing fileio API.

Changes since v1:

- the patches that were merged in v1 have been (obviously) dropped from
  this patchset;
- the patch that set the write-combine cache attribute has been
  dropped as well, as it was simply not useful.
- [01/12]: 
    * Only remove the outgoing queue, and keep the incoming queue, as we
      want the buffer to start streaming data as soon as it is enabled.
    * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
      same as IIO_BLOCK_STATE_DONE.
- [02/12]:
    * Fix block->state not being reset in
      iio_dma_buffer_request_update() for output buffers.
    * Only update block->bytes_used once and add a comment about why we
      update it.
    * Add a comment about why we're setting a different state for output
      buffers in iio_dma_buffer_request_update()
    * Remove useless cast to bool (!!) in iio_dma_buffer_io()
- [05/12]:
    Only allow the new IOCTLs on the buffer FD created with
    IIO_BUFFER_GET_FD_IOCTL().
- [12/12]:
    * Explicitly state that the new interface is optional and is
      not implemented by all drivers.
    * The IOCTLs can now only be called on the buffer FD returned by
      IIO_BUFFER_GET_FD_IOCTL.
    * Move the page up a bit in the index since it is core stuff and not
      driver-specific.

The patches not listed here have not been modified since v1.

Cheers,
-Paul

Alexandru Ardelean (1):
  iio: buffer-dma: split iio_dma_buffer_fileio_free() function

Paul Cercueil (11):
  iio: buffer-dma: Get rid of outgoing queue
  iio: buffer-dma: Enable buffer write support
  iio: buffer-dmaengine: Support specifying buffer direction
  iio: buffer-dmaengine: Enable write support
  iio: core: Add new DMABUF interface infrastructure
  iio: buffer-dma: Use DMABUFs instead of custom solution
  iio: buffer-dma: Implement new DMABUF based userspace API
  iio: buffer-dmaengine: Support new DMABUF based userspace API
  iio: core: Add support for cyclic buffers
  iio: buffer-dmaengine: Add support for cyclic buffers
  Documentation: iio: Document high-speed DMABUF based API

 Documentation/driver-api/dma-buf.rst          |   2 +
 Documentation/iio/dmabuf_api.rst              |  94 +++
 Documentation/iio/index.rst                   |   2 +
 drivers/iio/adc/adi-axi-adc.c                 |   3 +-
 drivers/iio/buffer/industrialio-buffer-dma.c  | 610 ++++++++++++++----
 .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
 drivers/iio/industrialio-buffer.c             |  60 ++
 include/linux/iio/buffer-dma.h                |  38 +-
 include/linux/iio/buffer-dmaengine.h          |   5 +-
 include/linux/iio/buffer_impl.h               |   8 +
 include/uapi/linux/iio/buffer.h               |  30 +
 11 files changed, 749 insertions(+), 145 deletions(-)
 create mode 100644 Documentation/iio/dmabuf_api.rst

-- 
2.34.1


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-02-13 18:57   ` Jonathan Cameron
  2022-03-28 17:17   ` Jonathan Cameron
  2022-02-07 12:59 ` [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support Paul Cercueil
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

The buffer-dma code was using two queues, incoming and outgoing, to
manage the state of the blocks in use.

While this totally works, it adds some complexity to the code,
especially since the code only manages 2 blocks. It is much easier to
just check each block's state manually, and keep a counter for the next
block to dequeue.

Since the new DMABUF based API wouldn't use the outgoing queue anyway,
getting rid of it now makes the upcoming changes simpler.

With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can
be removed.
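The counter-based dequeue described above can be sketched in plain C. This is a userspace model with hypothetical stand-in types (`block`, `fileio`, `dequeue`), not the kernel code itself: each block carries its own state, and a single index tracks the next block to hand out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types, for illustration. */
enum block_state { STATE_QUEUED, STATE_ACTIVE, STATE_DONE, STATE_DEAD };

struct block {
	enum block_state state;
};

struct fileio {
	struct block *blocks[2];
	unsigned int next_dequeue;
};

/*
 * Mirrors the patched iio_dma_buffer_dequeue(): look at the block at
 * next_dequeue; hand it out and advance the index (modulo the number
 * of blocks) only if it is DONE, otherwise return NULL.
 */
static struct block *dequeue(struct fileio *f)
{
	unsigned int idx = f->next_dequeue;
	struct block *b = f->blocks[idx];

	if (b->state != STATE_DONE)
		return NULL;

	f->next_dequeue = (idx + 1) % 2;	/* ARRAY_SIZE(blocks) == 2 */
	return b;
}
```

Since blocks complete in submission order, checking only the block at next_dequeue is equivalent to the old outgoing list, without the list manipulation.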

v2: - Only remove the outgoing queue, and keep the incoming queue, as we
      want the buffer to start streaming data as soon as it is enabled.
    - Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
      same as IIO_BLOCK_STATE_DONE.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 44 ++++++++++----------
 include/linux/iio/buffer-dma.h               |  7 ++--
 2 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index d348af8b9705..1fc91467d1aa 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -179,7 +179,7 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 	}
 
 	block->size = size;
-	block->state = IIO_BLOCK_STATE_DEQUEUED;
+	block->state = IIO_BLOCK_STATE_DONE;
 	block->queue = queue;
 	INIT_LIST_HEAD(&block->head);
 	kref_init(&block->kref);
@@ -191,16 +191,8 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 
 static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 {
-	struct iio_dma_buffer_queue *queue = block->queue;
-
-	/*
-	 * The buffer has already been freed by the application, just drop the
-	 * reference.
-	 */
-	if (block->state != IIO_BLOCK_STATE_DEAD) {
+	if (block->state != IIO_BLOCK_STATE_DEAD)
 		block->state = IIO_BLOCK_STATE_DONE;
-		list_add_tail(&block->head, &queue->outgoing);
-	}
 }
 
 /**
@@ -261,7 +253,6 @@ static bool iio_dma_block_reusable(struct iio_dma_buffer_block *block)
 	 * not support abort and has not given back the block yet.
 	 */
 	switch (block->state) {
-	case IIO_BLOCK_STATE_DEQUEUED:
 	case IIO_BLOCK_STATE_QUEUED:
 	case IIO_BLOCK_STATE_DONE:
 		return true;
@@ -317,7 +308,6 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 	 * dead. This means we can reset the lists without having to fear
 	 * corrution.
 	 */
-	INIT_LIST_HEAD(&queue->outgoing);
 	spin_unlock_irq(&queue->list_lock);
 
 	INIT_LIST_HEAD(&queue->incoming);
@@ -456,14 +446,20 @@ static struct iio_dma_buffer_block *iio_dma_buffer_dequeue(
 	struct iio_dma_buffer_queue *queue)
 {
 	struct iio_dma_buffer_block *block;
+	unsigned int idx;
 
 	spin_lock_irq(&queue->list_lock);
-	block = list_first_entry_or_null(&queue->outgoing, struct
-		iio_dma_buffer_block, head);
-	if (block != NULL) {
-		list_del(&block->head);
-		block->state = IIO_BLOCK_STATE_DEQUEUED;
+
+	idx = queue->fileio.next_dequeue;
+	block = queue->fileio.blocks[idx];
+
+	if (block->state == IIO_BLOCK_STATE_DONE) {
+		idx = (idx + 1) % ARRAY_SIZE(queue->fileio.blocks);
+		queue->fileio.next_dequeue = idx;
+	} else {
+		block = NULL;
 	}
+
 	spin_unlock_irq(&queue->list_lock);
 
 	return block;
@@ -539,6 +535,7 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
 	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buf);
 	struct iio_dma_buffer_block *block;
 	size_t data_available = 0;
+	unsigned int i;
 
 	/*
 	 * For counting the available bytes we'll use the size of the block not
@@ -552,8 +549,15 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
 		data_available += queue->fileio.active_block->size;
 
 	spin_lock_irq(&queue->list_lock);
-	list_for_each_entry(block, &queue->outgoing, head)
-		data_available += block->size;
+
+	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+		block = queue->fileio.blocks[i];
+
+		if (block != queue->fileio.active_block
+		    && block->state == IIO_BLOCK_STATE_DONE)
+			data_available += block->size;
+	}
+
 	spin_unlock_irq(&queue->list_lock);
 	mutex_unlock(&queue->lock);
 
@@ -617,7 +621,6 @@ int iio_dma_buffer_init(struct iio_dma_buffer_queue *queue,
 	queue->ops = ops;
 
 	INIT_LIST_HEAD(&queue->incoming);
-	INIT_LIST_HEAD(&queue->outgoing);
 
 	mutex_init(&queue->lock);
 	spin_lock_init(&queue->list_lock);
@@ -645,7 +648,6 @@ void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue)
 			continue;
 		queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
 	}
-	INIT_LIST_HEAD(&queue->outgoing);
 	spin_unlock_irq(&queue->list_lock);
 
 	INIT_LIST_HEAD(&queue->incoming);
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 6564bdcdac66..18d3702fa95d 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -19,14 +19,12 @@ struct device;
 
 /**
  * enum iio_block_state - State of a struct iio_dma_buffer_block
- * @IIO_BLOCK_STATE_DEQUEUED: Block is not queued
  * @IIO_BLOCK_STATE_QUEUED: Block is on the incoming queue
  * @IIO_BLOCK_STATE_ACTIVE: Block is currently being processed by the DMA
  * @IIO_BLOCK_STATE_DONE: Block is on the outgoing queue
  * @IIO_BLOCK_STATE_DEAD: Block has been marked as to be freed
  */
 enum iio_block_state {
-	IIO_BLOCK_STATE_DEQUEUED,
 	IIO_BLOCK_STATE_QUEUED,
 	IIO_BLOCK_STATE_ACTIVE,
 	IIO_BLOCK_STATE_DONE,
@@ -73,12 +71,15 @@ struct iio_dma_buffer_block {
  * @active_block: Block being used in read()
  * @pos: Read offset in the active block
  * @block_size: Size of each block
+ * @next_dequeue: index of next block that will be dequeued
  */
 struct iio_dma_buffer_queue_fileio {
 	struct iio_dma_buffer_block *blocks[2];
 	struct iio_dma_buffer_block *active_block;
 	size_t pos;
 	size_t block_size;
+
+	unsigned int next_dequeue;
 };
 
 /**
@@ -93,7 +94,6 @@ struct iio_dma_buffer_queue_fileio {
  *   list and typically also a list of active blocks in the part that handles
  *   the DMA controller
  * @incoming: List of buffers on the incoming queue
- * @outgoing: List of buffers on the outgoing queue
  * @active: Whether the buffer is currently active
  * @fileio: FileIO state
  */
@@ -105,7 +105,6 @@ struct iio_dma_buffer_queue {
 	struct mutex lock;
 	spinlock_t list_lock;
 	struct list_head incoming;
-	struct list_head outgoing;
 
 	bool active;
 
-- 
2.34.1



* [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
  2022-02-07 12:59 ` [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-03-28 17:24   ` Jonathan Cameron
  2022-03-28 20:38   ` Andy Shevchenko
  2022-02-07 12:59 ` [PATCH v2 03/12] iio: buffer-dmaengine: Support specifying buffer direction Paul Cercueil
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Adding write support to the buffer-dma code is easy - the write()
function basically needs to do the exact same thing as the read()
function: dequeue a block, read or write the data, enqueue the block
when entirely processed.

Therefore, iio_dma_buffer_read() and the new iio_dma_buffer_write()
now both call a function iio_dma_buffer_io(), which performs this
task.

The .space_available() callback can return the exact same value as the
.data_available() callback for input buffers, since in both cases we
count the exact same thing (the number of bytes in each available
block).

Note that we preemptively reset block->bytes_used to the buffer's size
in iio_dma_buffer_request_update(), as in the future the
iio_dma_buffer_enqueue() function won't reset it.
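The shared-helper pattern can be sketched in userspace C, with memcpy() standing in for copy_to_user()/copy_from_user() (the names here are illustrative, not the kernel functions): one body serves both directions, selected by a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace sketch of the pattern used by iio_dma_buffer_io():
 * read() and write() share one helper, and a direction flag picks
 * which side of the copy is the source. memcpy() stands in for
 * copy_to_user()/copy_from_user() here.
 */
static void block_io(char *block, size_t pos, char *user, size_t n,
		     bool is_write)
{
	if (is_write)
		memcpy(block + pos, user, n);	/* write(): user -> block */
	else
		memcpy(user, block + pos, n);	/* read(): block -> user */
}
```

In the kernel helper the same bounds handling (rounding n down to bytes_per_datum, clamping against bytes_used - pos) applies to both directions, which is what makes the factoring worthwhile.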

v2: - Fix block->state not being reset in
      iio_dma_buffer_request_update() for output buffers.
    - Only update block->bytes_used once and add a comment about why we
      update it.
    - Add a comment about why we're setting a different state for output
      buffers in iio_dma_buffer_request_update()
    - Remove useless cast to bool (!!) in iio_dma_buffer_io()

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 88 ++++++++++++++++----
 include/linux/iio/buffer-dma.h               |  7 ++
 2 files changed, 79 insertions(+), 16 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index 1fc91467d1aa..a9f1b673374f 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -195,6 +195,18 @@ static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 		block->state = IIO_BLOCK_STATE_DONE;
 }
 
+static void iio_dma_buffer_queue_wake(struct iio_dma_buffer_queue *queue)
+{
+	__poll_t flags;
+
+	if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN)
+		flags = EPOLLIN | EPOLLRDNORM;
+	else
+		flags = EPOLLOUT | EPOLLWRNORM;
+
+	wake_up_interruptible_poll(&queue->buffer.pollq, flags);
+}
+
 /**
  * iio_dma_buffer_block_done() - Indicate that a block has been completed
  * @block: The completed block
@@ -212,7 +224,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
 	iio_buffer_block_put_atomic(block);
-	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
+	iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
 
@@ -241,7 +253,7 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
 	}
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
-	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
+	iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_list_abort);
 
@@ -335,8 +347,24 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 			queue->fileio.blocks[i] = block;
 		}
 
-		block->state = IIO_BLOCK_STATE_QUEUED;
-		list_add_tail(&block->head, &queue->incoming);
+		/*
+		 * block->bytes_used may have been modified previously, e.g. by
+		 * iio_dma_buffer_block_list_abort(). Reset it here to the
+		 * block's size so that iio_dma_buffer_io() will work.
+		 */
+		block->bytes_used = block->size;
+
+		/*
+		 * If it's an input buffer, mark the block as queued, and
+		 * iio_dma_buffer_enable() will submit it. Otherwise mark it as
+		 * done, which means it's ready to be dequeued.
+		 */
+		if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
+			block->state = IIO_BLOCK_STATE_QUEUED;
+			list_add_tail(&block->head, &queue->incoming);
+		} else {
+			block->state = IIO_BLOCK_STATE_DONE;
+		}
 	}
 
 out_unlock:
@@ -465,20 +493,12 @@ static struct iio_dma_buffer_block *iio_dma_buffer_dequeue(
 	return block;
 }
 
-/**
- * iio_dma_buffer_read() - DMA buffer read callback
- * @buffer: Buffer to read form
- * @n: Number of bytes to read
- * @user_buffer: Userspace buffer to copy the data to
- *
- * Should be used as the read callback for iio_buffer_access_ops
- * struct for DMA buffers.
- */
-int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
-	char __user *user_buffer)
+static int iio_dma_buffer_io(struct iio_buffer *buffer,
+			     size_t n, char __user *user_buffer, bool is_write)
 {
 	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
 	struct iio_dma_buffer_block *block;
+	void *addr;
 	int ret;
 
 	if (n < buffer->bytes_per_datum)
@@ -501,8 +521,13 @@ int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 	n = rounddown(n, buffer->bytes_per_datum);
 	if (n > block->bytes_used - queue->fileio.pos)
 		n = block->bytes_used - queue->fileio.pos;
+	addr = block->vaddr + queue->fileio.pos;
 
-	if (copy_to_user(user_buffer, block->vaddr + queue->fileio.pos, n)) {
+	if (is_write)
+		ret = copy_from_user(addr, user_buffer, n);
+	else
+		ret = copy_to_user(user_buffer, addr, n);
+	if (ret) {
 		ret = -EFAULT;
 		goto out_unlock;
 	}
@@ -521,8 +546,39 @@ int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 
 	return ret;
 }
+
+/**
+ * iio_dma_buffer_read() - DMA buffer read callback
+ * @buffer: Buffer to read from
+ * @n: Number of bytes to read
+ * @user_buffer: Userspace buffer to copy the data to
+ *
+ * Should be used as the read callback for iio_buffer_access_ops
+ * struct for DMA buffers.
+ */
+int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
+	char __user *user_buffer)
+{
+	return iio_dma_buffer_io(buffer, n, user_buffer, false);
+}
 EXPORT_SYMBOL_GPL(iio_dma_buffer_read);
 
+/**
+ * iio_dma_buffer_write() - DMA buffer write callback
+ * @buffer: Buffer to write to
+ * @n: Number of bytes to write
+ * @user_buffer: Userspace buffer to copy the data from
+ *
+ * Should be used as the write callback for iio_buffer_access_ops
+ * struct for DMA buffers.
+ */
+int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
+			 const char __user *user_buffer)
+{
+	return iio_dma_buffer_io(buffer, n, (__force char *)user_buffer, true);
+}
+EXPORT_SYMBOL_GPL(iio_dma_buffer_write);
+
 /**
  * iio_dma_buffer_data_available() - DMA buffer data_available callback
  * @buf: Buffer to check for data availability
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 18d3702fa95d..490b93f76fa8 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -132,6 +132,8 @@ int iio_dma_buffer_disable(struct iio_buffer *buffer,
 	struct iio_dev *indio_dev);
 int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 	char __user *user_buffer);
+int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
+			 const char __user *user_buffer);
 size_t iio_dma_buffer_data_available(struct iio_buffer *buffer);
 int iio_dma_buffer_set_bytes_per_datum(struct iio_buffer *buffer, size_t bpd);
 int iio_dma_buffer_set_length(struct iio_buffer *buffer, unsigned int length);
@@ -142,4 +144,9 @@ int iio_dma_buffer_init(struct iio_dma_buffer_queue *queue,
 void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue);
 void iio_dma_buffer_release(struct iio_dma_buffer_queue *queue);
 
+static inline size_t iio_dma_buffer_space_available(struct iio_buffer *buffer)
+{
+	return iio_dma_buffer_data_available(buffer);
+}
+
 #endif
-- 
2.34.1



* [PATCH v2 03/12] iio: buffer-dmaengine: Support specifying buffer direction
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
  2022-02-07 12:59 ` [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue Paul Cercueil
  2022-02-07 12:59 ` [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-02-07 12:59 ` [PATCH v2 04/12] iio: buffer-dmaengine: Enable write support Paul Cercueil
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Update the devm_iio_dmaengine_buffer_setup() function to support
specifying the buffer direction.

Update the iio_dmaengine_buffer_submit() function to handle input
buffers as well as output buffers.
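The direction handling in the submit path can be modeled in plain C (hypothetical stand-in enums and function name, not the kernel code): for input buffers the core picks bytes_used itself, while for output buffers userspace already set it and it is only validated.

```c
#include <assert.h>
#include <stddef.h>

enum dir { DIR_IN, DIR_OUT };		/* IIO_BUFFER_DIRECTION_* stand-ins */
enum dma_dir { DEV_TO_MEM, MEM_TO_DEV };

/*
 * Mirrors the patched iio_dmaengine_buffer_submit_block() logic:
 * compute the aligned maximum transfer size, fill in bytes_used for
 * capture buffers, and reject invalid bytes_used. Returns 0 on
 * success, -1 on an invalid size (the kernel returns -EINVAL).
 */
static int pick_transfer(enum dir d, size_t size, size_t max_size,
			 size_t align, size_t *bytes_used, enum dma_dir *out)
{
	size_t max = size < max_size ? size : max_size;

	max -= max % align;		/* round_down(max, align) */

	if (d == DIR_IN) {
		*bytes_used = max;
		*out = DEV_TO_MEM;
	} else {
		*out = MEM_TO_DEV;
	}

	if (!*bytes_used || *bytes_used > max)
		return -1;
	return 0;
}
```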

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
---
 drivers/iio/adc/adi-axi-adc.c                 |  3 ++-
 .../buffer/industrialio-buffer-dmaengine.c    | 24 +++++++++++++++----
 include/linux/iio/buffer-dmaengine.h          |  5 +++-
 3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/iio/adc/adi-axi-adc.c b/drivers/iio/adc/adi-axi-adc.c
index a73e3c2d212f..0a6f2c32b1b9 100644
--- a/drivers/iio/adc/adi-axi-adc.c
+++ b/drivers/iio/adc/adi-axi-adc.c
@@ -113,7 +113,8 @@ static int adi_axi_adc_config_dma_buffer(struct device *dev,
 		dma_name = "rx";
 
 	return devm_iio_dmaengine_buffer_setup(indio_dev->dev.parent,
-					       indio_dev, dma_name);
+					       indio_dev, dma_name,
+					       IIO_BUFFER_DIRECTION_IN);
 }
 
 static int adi_axi_adc_read_raw(struct iio_dev *indio_dev,
diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index f8ce26a24c57..ac26b04aa4a9 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -64,14 +64,25 @@ static int iio_dmaengine_buffer_submit_block(struct iio_dma_buffer_queue *queue,
 	struct dmaengine_buffer *dmaengine_buffer =
 		iio_buffer_to_dmaengine_buffer(&queue->buffer);
 	struct dma_async_tx_descriptor *desc;
+	enum dma_transfer_direction dma_dir;
+	size_t max_size;
 	dma_cookie_t cookie;
 
-	block->bytes_used = min(block->size, dmaengine_buffer->max_size);
-	block->bytes_used = round_down(block->bytes_used,
-			dmaengine_buffer->align);
+	max_size = min(block->size, dmaengine_buffer->max_size);
+	max_size = round_down(max_size, dmaengine_buffer->align);
+
+	if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
+		block->bytes_used = max_size;
+		dma_dir = DMA_DEV_TO_MEM;
+	} else {
+		dma_dir = DMA_MEM_TO_DEV;
+	}
+
+	if (!block->bytes_used || block->bytes_used > max_size)
+		return -EINVAL;
 
 	desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
-		block->phys_addr, block->bytes_used, DMA_DEV_TO_MEM,
+		block->phys_addr, block->bytes_used, dma_dir,
 		DMA_PREP_INTERRUPT);
 	if (!desc)
 		return -ENOMEM;
@@ -275,7 +286,8 @@ static struct iio_buffer *devm_iio_dmaengine_buffer_alloc(struct device *dev,
  */
 int devm_iio_dmaengine_buffer_setup(struct device *dev,
 				    struct iio_dev *indio_dev,
-				    const char *channel)
+				    const char *channel,
+				    enum iio_buffer_direction dir)
 {
 	struct iio_buffer *buffer;
 
@@ -286,6 +298,8 @@ int devm_iio_dmaengine_buffer_setup(struct device *dev,
 
 	indio_dev->modes |= INDIO_BUFFER_HARDWARE;
 
+	buffer->direction = dir;
+
 	return iio_device_attach_buffer(indio_dev, buffer);
 }
 EXPORT_SYMBOL_GPL(devm_iio_dmaengine_buffer_setup);
diff --git a/include/linux/iio/buffer-dmaengine.h b/include/linux/iio/buffer-dmaengine.h
index 5c355be89814..538d0479cdd6 100644
--- a/include/linux/iio/buffer-dmaengine.h
+++ b/include/linux/iio/buffer-dmaengine.h
@@ -7,11 +7,14 @@
 #ifndef __IIO_DMAENGINE_H__
 #define __IIO_DMAENGINE_H__
 
+#include <linux/iio/buffer.h>
+
 struct iio_dev;
 struct device;
 
 int devm_iio_dmaengine_buffer_setup(struct device *dev,
 				    struct iio_dev *indio_dev,
-				    const char *channel);
+				    const char *channel,
+				    enum iio_buffer_direction dir);
 
 #endif
-- 
2.34.1



* [PATCH v2 04/12] iio: buffer-dmaengine: Enable write support
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (2 preceding siblings ...)
  2022-02-07 12:59 ` [PATCH v2 03/12] iio: buffer-dmaengine: Support specifying buffer direction Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-03-28 17:28   ` Jonathan Cameron
  2022-02-07 12:59 ` [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure Paul Cercueil
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Use the iio_dma_buffer_write() and iio_dma_buffer_space_available()
functions provided by the buffer-dma core, to enable write support in
the buffer-dmaengine code.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
---
 drivers/iio/buffer/industrialio-buffer-dmaengine.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index ac26b04aa4a9..5cde8fd81c7f 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -123,12 +123,14 @@ static void iio_dmaengine_buffer_release(struct iio_buffer *buf)
 
 static const struct iio_buffer_access_funcs iio_dmaengine_buffer_ops = {
 	.read = iio_dma_buffer_read,
+	.write = iio_dma_buffer_write,
 	.set_bytes_per_datum = iio_dma_buffer_set_bytes_per_datum,
 	.set_length = iio_dma_buffer_set_length,
 	.request_update = iio_dma_buffer_request_update,
 	.enable = iio_dma_buffer_enable,
 	.disable = iio_dma_buffer_disable,
 	.data_available = iio_dma_buffer_data_available,
+	.space_available = iio_dma_buffer_space_available,
 	.release = iio_dmaengine_buffer_release,
 
 	.modes = INDIO_BUFFER_HARDWARE,
-- 
2.34.1



* [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (3 preceding siblings ...)
  2022-02-07 12:59 ` [PATCH v2 04/12] iio: buffer-dmaengine: Enable write support Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-03-28 17:37   ` Jonathan Cameron
  2022-03-28 20:46   ` Andy Shevchenko
  2022-02-07 12:59 ` [PATCH v2 06/12] iio: buffer-dma: split iio_dma_buffer_fileio_free() function Paul Cercueil
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Add the necessary infrastructure to the IIO core to support a new
optional DMABUF based interface.

The advantage of this new DMABUF based interface vs. the read()
interface is that it avoids an extra copy of the data between the
kernel and userspace. This is particularly useful for high-speed
devices which produce several megabytes or even gigabytes of data per
second.

The data in this new DMABUF interface is managed at the granularity of
DMABUF objects. Reducing the granularity from byte level to block level
is done to reduce the userspace-kernelspace synchronization overhead
since performing syscalls for each byte at a few Mbps is just not
feasible.

This of course leads to a slightly increased latency. For this reason an
application can choose the size of the DMABUFs as well as how many it
allocates. E.g. two DMABUFs would be a traditional double buffering
scheme. But using a higher number might be necessary to avoid
underflow/overflow situations in the presence of scheduling latencies.

As part of the interface, two new IOCTLs have been added:

IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *):
 Each call will allocate a new DMABUF object. The return value (if not
 a negative errno value as error) will be the file descriptor of the new
 DMABUF.

IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
 Place the DMABUF object into the queue of blocks pending hardware
 processing.

These two IOCTLs have to be performed on the IIO buffer's file
descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.

For userspace to access the data stored in a block, the block must be
mapped into the process's memory. This is done by calling mmap() on the
DMABUF's file descriptor.

Before accessing the data through the map, you must use the
DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
DMA_BUF_SYNC_START flag, to make sure that the data is available.
This call may block until the hardware is done with this block. Once
you are done reading or writing the data, you must use this ioctl again
with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
kernel's queue.

If you need to know when the hardware is done with a DMABUF, you can
poll its file descriptor for the EPOLLOUT event.

Finally, to destroy a DMABUF object, simply call close() on its file
descriptor.

A typical workflow for the new interface is:

  for block in blocks:
    DMABUF_ALLOC block
    mmap block

  enable buffer

  while !done
    for block in blocks:
      DMABUF_ENQUEUE block

      DMABUF_SYNC_START block
      process data
      DMABUF_SYNC_END block

  disable buffer

  for block in blocks:
    close block
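At the allocation step of the workflow above, userspace fills a struct iio_dmabuf_alloc_req and passes it to IIO_BUFFER_DMABUF_ALLOC_IOCTL on the buffer FD. A minimal sketch follows; the structure is redefined locally here for illustration, and the helper name is hypothetical. A real program would include the updated <linux/iio/buffer.h> and call ioctl() with the filled request:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Local copy of the uAPI structure from this series, for illustration.
 * A real program would get it from <linux/iio/buffer.h> (with this
 * series applied) and pass it to
 * ioctl(buf_fd, IIO_BUFFER_DMABUF_ALLOC_IOCTL, &req), which returns
 * the new DMABUF's file descriptor on success.
 */
struct iio_dmabuf_alloc_req {
	uint64_t size;
	uint64_t resv;
};

/* Fill a request for one DMABUF of @size bytes; resv must stay zero. */
static void prep_alloc_req(struct iio_dmabuf_alloc_req *req, uint64_t size)
{
	memset(req, 0, sizeof(*req));	/* clears resv, which the kernel checks */
	req->size = size;
}
```

The returned fd is then mmap()ed, each access is bracketed by DMA_BUF_IOCTL_SYNC with the START and END flags, and the fd is finally close()d, exactly as in the workflow above.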

v2: Only allow the new IOCTLs on the buffer FD created with
    IIO_BUFFER_GET_FD_IOCTL().

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
---
 drivers/iio/industrialio-buffer.c | 55 +++++++++++++++++++++++++++++++
 include/linux/iio/buffer_impl.h   |  8 +++++
 include/uapi/linux/iio/buffer.h   | 29 ++++++++++++++++
 3 files changed, 92 insertions(+)

diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
index 94eb9f6cf128..72f333a519bc 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -17,6 +17,7 @@
 #include <linux/fs.h>
 #include <linux/cdev.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 #include <linux/poll.h>
 #include <linux/sched/signal.h>
 
@@ -1520,11 +1521,65 @@ static int iio_buffer_chrdev_release(struct inode *inode, struct file *filep)
 	return 0;
 }
 
+static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
+				     struct iio_dmabuf __user *user_buf)
+{
+	struct iio_dmabuf dmabuf;
+
+	if (!buffer->access->enqueue_dmabuf)
+		return -EPERM;
+
+	if (copy_from_user(&dmabuf, user_buf, sizeof(dmabuf)))
+		return -EFAULT;
+
+	if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
+		return -EINVAL;
+
+	return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
+}
+
+static int iio_buffer_alloc_dmabuf(struct iio_buffer *buffer,
+				   struct iio_dmabuf_alloc_req __user *user_req)
+{
+	struct iio_dmabuf_alloc_req req;
+
+	if (!buffer->access->alloc_dmabuf)
+		return -EPERM;
+
+	if (copy_from_user(&req, user_req, sizeof(req)))
+		return -EFAULT;
+
+	if (req.resv)
+		return -EINVAL;
+
+	return buffer->access->alloc_dmabuf(buffer, &req);
+}
+
+static long iio_buffer_chrdev_ioctl(struct file *filp,
+				    unsigned int cmd, unsigned long arg)
+{
+	struct iio_dev_buffer_pair *ib = filp->private_data;
+	struct iio_buffer *buffer = ib->buffer;
+	void __user *_arg = (void __user *)arg;
+
+	switch (cmd) {
+	case IIO_BUFFER_DMABUF_ALLOC_IOCTL:
+		return iio_buffer_alloc_dmabuf(buffer, _arg);
+	case IIO_BUFFER_DMABUF_ENQUEUE_IOCTL:
+		/* TODO: support non-blocking enqueue operation */
+		return iio_buffer_enqueue_dmabuf(buffer, _arg);
+	default:
+		return IIO_IOCTL_UNHANDLED;
+	}
+}
+
 static const struct file_operations iio_buffer_chrdev_fileops = {
 	.owner = THIS_MODULE,
 	.llseek = noop_llseek,
 	.read = iio_buffer_read,
 	.write = iio_buffer_write,
+	.unlocked_ioctl = iio_buffer_chrdev_ioctl,
+	.compat_ioctl = compat_ptr_ioctl,
 	.poll = iio_buffer_poll,
 	.release = iio_buffer_chrdev_release,
 };
diff --git a/include/linux/iio/buffer_impl.h b/include/linux/iio/buffer_impl.h
index e2ca8ea23e19..728541bc2c63 100644
--- a/include/linux/iio/buffer_impl.h
+++ b/include/linux/iio/buffer_impl.h
@@ -39,6 +39,9 @@ struct iio_buffer;
 *                      device stops sampling. Calls are balanced with @enable.
  * @release:		called when the last reference to the buffer is dropped,
  *			should free all resources allocated by the buffer.
+ * @alloc_dmabuf:	called from userspace via ioctl to allocate one DMABUF.
+ * @enqueue_dmabuf:	called from userspace via ioctl to queue this DMABUF
+ *			object to this buffer. Requires a valid DMABUF fd.
  * @modes:		Supported operating modes by this buffer type
  * @flags:		A bitmask combination of INDIO_BUFFER_FLAG_*
  *
@@ -68,6 +71,11 @@ struct iio_buffer_access_funcs {
 
 	void (*release)(struct iio_buffer *buffer);
 
+	int (*alloc_dmabuf)(struct iio_buffer *buffer,
+			    struct iio_dmabuf_alloc_req *req);
+	int (*enqueue_dmabuf)(struct iio_buffer *buffer,
+			      struct iio_dmabuf *block);
+
 	unsigned int modes;
 	unsigned int flags;
 };
diff --git a/include/uapi/linux/iio/buffer.h b/include/uapi/linux/iio/buffer.h
index 13939032b3f6..e4621b926262 100644
--- a/include/uapi/linux/iio/buffer.h
+++ b/include/uapi/linux/iio/buffer.h
@@ -5,6 +5,35 @@
 #ifndef _UAPI_IIO_BUFFER_H_
 #define _UAPI_IIO_BUFFER_H_
 
+#include <linux/types.h>
+
+#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000000
+
+/**
+ * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO DMABUFs
+ * @size:	the size of a single DMABUF
+ * @resv:	reserved
+ */
+struct iio_dmabuf_alloc_req {
+	__u64 size;
+	__u64 resv;
+};
+
+/**
+ * struct iio_dmabuf - Descriptor for a single IIO DMABUF object
+ * @fd:		file descriptor of the DMABUF object
+ * @flags:	one or more IIO_BUFFER_DMABUF_* flags
+ * @bytes_used:	number of bytes used in this DMABUF for the data transfer.
+ *		If zero, the full buffer is used.
+ */
+struct iio_dmabuf {
+	__u32 fd;
+	__u32 flags;
+	__u64 bytes_used;
+};
+
 #define IIO_BUFFER_GET_FD_IOCTL			_IOWR('i', 0x91, int)
+#define IIO_BUFFER_DMABUF_ALLOC_IOCTL		_IOW('i', 0x92, struct iio_dmabuf_alloc_req)
+#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL		_IOW('i', 0x93, struct iio_dmabuf)
 
 #endif /* _UAPI_IIO_BUFFER_H_ */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 06/12] iio: buffer-dma: split iio_dma_buffer_fileio_free() function
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (4 preceding siblings ...)
  2022-02-07 12:59 ` [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-02-07 12:59 ` [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution Paul Cercueil
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio,
	Alexandru Ardelean, Paul Cercueil

From: Alexandru Ardelean <alexandru.ardelean@analog.com>

Part of the logic in iio_dma_buffer_exit() is required for the upcoming
change that adds mmap support to IIO buffers.
Split that logic into a separate function, which will be re-used later.

Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Cc: Alexandru Ardelean <ardeleanalex@gmail.com>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 43 +++++++++++---------
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index a9f1b673374f..15ea7bc3ac08 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -374,6 +374,29 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_request_update);
 
+static void iio_dma_buffer_fileio_free(struct iio_dma_buffer_queue *queue)
+{
+	unsigned int i;
+
+	spin_lock_irq(&queue->list_lock);
+	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+		if (!queue->fileio.blocks[i])
+			continue;
+		queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
+	}
+	spin_unlock_irq(&queue->list_lock);
+
+	INIT_LIST_HEAD(&queue->incoming);
+
+	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+		if (!queue->fileio.blocks[i])
+			continue;
+		iio_buffer_block_put(queue->fileio.blocks[i]);
+		queue->fileio.blocks[i] = NULL;
+	}
+	queue->fileio.active_block = NULL;
+}
+
 static void iio_dma_buffer_submit_block(struct iio_dma_buffer_queue *queue,
 	struct iio_dma_buffer_block *block)
 {
@@ -694,27 +717,9 @@ EXPORT_SYMBOL_GPL(iio_dma_buffer_init);
  */
 void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue)
 {
-	unsigned int i;
-
 	mutex_lock(&queue->lock);
 
-	spin_lock_irq(&queue->list_lock);
-	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
-		if (!queue->fileio.blocks[i])
-			continue;
-		queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
-	}
-	spin_unlock_irq(&queue->list_lock);
-
-	INIT_LIST_HEAD(&queue->incoming);
-
-	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
-		if (!queue->fileio.blocks[i])
-			continue;
-		iio_buffer_block_put(queue->fileio.blocks[i]);
-		queue->fileio.blocks[i] = NULL;
-	}
-	queue->fileio.active_block = NULL;
+	iio_dma_buffer_fileio_free(queue);
 	queue->ops = NULL;
 
 	mutex_unlock(&queue->lock);
-- 
2.34.1



* [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (5 preceding siblings ...)
  2022-02-07 12:59 ` [PATCH v2 06/12] iio: buffer-dma: split iio_dma_buffer_fileio_free() function Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-03-28 17:54   ` Jonathan Cameron
  2022-02-07 12:59 ` [PATCH v2 08/12] iio: buffer-dma: Implement new DMABUF based userspace API Paul Cercueil
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Enhance the current fileio code by using DMABUF objects instead of
custom buffers.

This adds more code than it removes, but:
- a lot of the complexity can be dropped, e.g. the custom kref and
  iio_buffer_block_put_atomic() are no longer needed;
- it becomes much easier to introduce an API to export these DMABUF
  objects to userspace in a subsequent patch.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 192 ++++++++++++-------
 include/linux/iio/buffer-dma.h               |   8 +-
 2 files changed, 122 insertions(+), 78 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index 15ea7bc3ac08..54e6000cd2ee 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -14,6 +14,7 @@
 #include <linux/poll.h>
 #include <linux/iio/buffer_impl.h>
 #include <linux/iio/buffer-dma.h>
+#include <linux/dma-buf.h>
 #include <linux/dma-mapping.h>
 #include <linux/sizes.h>
 
@@ -90,103 +91,145 @@
  * callback is called from within the custom callback.
  */
 
-static void iio_buffer_block_release(struct kref *kref)
-{
-	struct iio_dma_buffer_block *block = container_of(kref,
-		struct iio_dma_buffer_block, kref);
-
-	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
-
-	dma_free_coherent(block->queue->dev, PAGE_ALIGN(block->size),
-					block->vaddr, block->phys_addr);
-
-	iio_buffer_put(&block->queue->buffer);
-	kfree(block);
-}
-
-static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
-{
-	kref_get(&block->kref);
-}
-
-static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
-{
-	kref_put(&block->kref, iio_buffer_block_release);
-}
-
-/*
- * dma_free_coherent can sleep, hence we need to take some special care to be
- * able to drop a reference from an atomic context.
- */
-static LIST_HEAD(iio_dma_buffer_dead_blocks);
-static DEFINE_SPINLOCK(iio_dma_buffer_dead_blocks_lock);
-
-static void iio_dma_buffer_cleanup_worker(struct work_struct *work)
-{
-	struct iio_dma_buffer_block *block, *_block;
-	LIST_HEAD(block_list);
-
-	spin_lock_irq(&iio_dma_buffer_dead_blocks_lock);
-	list_splice_tail_init(&iio_dma_buffer_dead_blocks, &block_list);
-	spin_unlock_irq(&iio_dma_buffer_dead_blocks_lock);
-
-	list_for_each_entry_safe(block, _block, &block_list, head)
-		iio_buffer_block_release(&block->kref);
-}
-static DECLARE_WORK(iio_dma_buffer_cleanup_work, iio_dma_buffer_cleanup_worker);
-
-static void iio_buffer_block_release_atomic(struct kref *kref)
-{
+struct iio_buffer_dma_buf_attachment {
+	struct scatterlist sgl;
+	struct sg_table sg_table;
 	struct iio_dma_buffer_block *block;
-	unsigned long flags;
-
-	block = container_of(kref, struct iio_dma_buffer_block, kref);
-
-	spin_lock_irqsave(&iio_dma_buffer_dead_blocks_lock, flags);
-	list_add_tail(&block->head, &iio_dma_buffer_dead_blocks);
-	spin_unlock_irqrestore(&iio_dma_buffer_dead_blocks_lock, flags);
-
-	schedule_work(&iio_dma_buffer_cleanup_work);
-}
-
-/*
- * Version of iio_buffer_block_put() that can be called from atomic context
- */
-static void iio_buffer_block_put_atomic(struct iio_dma_buffer_block *block)
-{
-	kref_put(&block->kref, iio_buffer_block_release_atomic);
-}
+};
 
 static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
 {
 	return container_of(buf, struct iio_dma_buffer_queue, buffer);
 }
 
+static struct iio_buffer_dma_buf_attachment *
+to_iio_buffer_dma_buf_attachment(struct sg_table *table)
+{
+	return container_of(table, struct iio_buffer_dma_buf_attachment, sg_table);
+}
+
+static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
+{
+	get_dma_buf(block->dmabuf);
+}
+
+static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
+{
+	dma_buf_put(block->dmabuf);
+}
+
+static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
+				     struct dma_buf_attachment *at)
+{
+	at->priv = dbuf->priv;
+
+	return 0;
+}
+
+static struct sg_table *iio_buffer_dma_buf_map(struct dma_buf_attachment *at,
+					       enum dma_data_direction dma_dir)
+{
+	struct iio_dma_buffer_block *block = at->priv;
+	struct iio_buffer_dma_buf_attachment *dba;
+	int ret;
+
+	dba = kzalloc(sizeof(*dba), GFP_KERNEL);
+	if (!dba)
+		return ERR_PTR(-ENOMEM);
+
+	sg_init_one(&dba->sgl, block->vaddr, PAGE_ALIGN(block->size));
+	dba->sg_table.sgl = &dba->sgl;
+	dba->sg_table.nents = 1;
+	dba->block = block;
+
+	ret = dma_map_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
+	if (ret) {
+		kfree(dba);
+		return ERR_PTR(ret);
+	}
+
+	return &dba->sg_table;
+}
+
+static void iio_buffer_dma_buf_unmap(struct dma_buf_attachment *at,
+				     struct sg_table *sg_table,
+				     enum dma_data_direction dma_dir)
+{
+	struct iio_buffer_dma_buf_attachment *dba =
+		to_iio_buffer_dma_buf_attachment(sg_table);
+
+	dma_unmap_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
+	kfree(dba);
+}
+
+static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
+{
+	struct iio_dma_buffer_block *block = dbuf->priv;
+	struct iio_dma_buffer_queue *queue = block->queue;
+
+	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
+
+	mutex_lock(&queue->lock);
+
+	dma_free_coherent(queue->dev, PAGE_ALIGN(block->size),
+			  block->vaddr, block->phys_addr);
+	kfree(block);
+
+	mutex_unlock(&queue->lock);
+	iio_buffer_put(&queue->buffer);
+}
+
+static const struct dma_buf_ops iio_dma_buffer_dmabuf_ops = {
+	.attach			= iio_buffer_dma_buf_attach,
+	.map_dma_buf		= iio_buffer_dma_buf_map,
+	.unmap_dma_buf		= iio_buffer_dma_buf_unmap,
+	.release		= iio_buffer_dma_buf_release,
+};
+
 static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 	struct iio_dma_buffer_queue *queue, size_t size)
 {
 	struct iio_dma_buffer_block *block;
+	DEFINE_DMA_BUF_EXPORT_INFO(einfo);
+	struct dma_buf *dmabuf;
+	int err = -ENOMEM;
 
 	block = kzalloc(sizeof(*block), GFP_KERNEL);
 	if (!block)
-		return NULL;
+		return ERR_PTR(err);
 
 	block->vaddr = dma_alloc_coherent(queue->dev, PAGE_ALIGN(size),
 		&block->phys_addr, GFP_KERNEL);
-	if (!block->vaddr) {
-		kfree(block);
-		return NULL;
+	if (!block->vaddr)
+		goto err_free_block;
+
+	einfo.ops = &iio_dma_buffer_dmabuf_ops;
+	einfo.size = PAGE_ALIGN(size);
+	einfo.priv = block;
+	einfo.flags = O_RDWR;
+
+	dmabuf = dma_buf_export(&einfo);
+	if (IS_ERR(dmabuf)) {
+		err = PTR_ERR(dmabuf);
+		goto err_free_dma;
 	}
 
+	block->dmabuf = dmabuf;
 	block->size = size;
 	block->state = IIO_BLOCK_STATE_DONE;
 	block->queue = queue;
 	INIT_LIST_HEAD(&block->head);
-	kref_init(&block->kref);
 
 	iio_buffer_get(&queue->buffer);
 
 	return block;
+
+err_free_dma:
+	dma_free_coherent(queue->dev, PAGE_ALIGN(size),
+			  block->vaddr, block->phys_addr);
+err_free_block:
+	kfree(block);
+	return ERR_PTR(err);
 }
 
 static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
@@ -223,7 +266,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 	_iio_dma_buffer_block_done(block);
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
-	iio_buffer_block_put_atomic(block);
+	iio_buffer_block_put(block);
 	iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
@@ -249,7 +292,8 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
 		list_del(&block->head);
 		block->bytes_used = 0;
 		_iio_dma_buffer_block_done(block);
-		iio_buffer_block_put_atomic(block);
+
+		iio_buffer_block_put(block);
 	}
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
@@ -340,8 +384,8 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 
 		if (!block) {
 			block = iio_dma_buffer_alloc_block(queue, size);
-			if (!block) {
-				ret = -ENOMEM;
+			if (IS_ERR(block)) {
+				ret = PTR_ERR(block);
 				goto out_unlock;
 			}
 			queue->fileio.blocks[i] = block;
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 490b93f76fa8..6b3fa7d2124b 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -8,7 +8,6 @@
 #define __INDUSTRIALIO_DMA_BUFFER_H__
 
 #include <linux/list.h>
-#include <linux/kref.h>
 #include <linux/spinlock.h>
 #include <linux/mutex.h>
 #include <linux/iio/buffer_impl.h>
@@ -16,6 +15,7 @@
 struct iio_dma_buffer_queue;
 struct iio_dma_buffer_ops;
 struct device;
+struct dma_buf;
 
 /**
  * enum iio_block_state - State of a struct iio_dma_buffer_block
@@ -39,8 +39,8 @@ enum iio_block_state {
 * @vaddr: Virtual address of the blocks memory
  * @phys_addr: Physical address of the blocks memory
  * @queue: Parent DMA buffer queue
- * @kref: kref used to manage the lifetime of block
  * @state: Current state of the block
+ * @dmabuf: Underlying DMABUF object
  */
 struct iio_dma_buffer_block {
 	/* May only be accessed by the owner of the block */
@@ -56,13 +56,13 @@ struct iio_dma_buffer_block {
 	size_t size;
 	struct iio_dma_buffer_queue *queue;
 
-	/* Must not be accessed outside the core. */
-	struct kref kref;
 	/*
 	 * Must not be accessed outside the core. Access needs to hold
 	 * queue->list_lock if the block is not owned by the core.
 	 */
 	enum iio_block_state state;
+
+	struct dma_buf *dmabuf;
 };
 
 /**
-- 
2.34.1



* [PATCH v2 08/12] iio: buffer-dma: Implement new DMABUF based userspace API
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (6 preceding siblings ...)
  2022-02-07 12:59 ` [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-02-07 12:59 ` [PATCH v2 09/12] iio: buffer-dmaengine: Support " Paul Cercueil
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Implement the two functions iio_dma_buffer_alloc_dmabuf() and
iio_dma_buffer_enqueue_dmabuf(), as well as all the necessary bits to
enable userspace access to the DMABUF objects.

These two functions are exported as GPL symbols so that IIO buffer
implementations can support the new DMABUF based userspace API.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 260 ++++++++++++++++++-
 include/linux/iio/buffer-dma.h               |  13 +
 2 files changed, 266 insertions(+), 7 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index 54e6000cd2ee..b9c3b01c5ea0 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -15,7 +15,9 @@
 #include <linux/iio/buffer_impl.h>
 #include <linux/iio/buffer-dma.h>
 #include <linux/dma-buf.h>
+#include <linux/dma-fence.h>
 #include <linux/dma-mapping.h>
+#include <linux/dma-resv.h>
 #include <linux/sizes.h>
 
 /*
@@ -97,6 +99,18 @@ struct iio_buffer_dma_buf_attachment {
 	struct iio_dma_buffer_block *block;
 };
 
+struct iio_buffer_dma_fence {
+	struct dma_fence base;
+	struct iio_dma_buffer_block *block;
+	spinlock_t lock;
+};
+
+static struct iio_buffer_dma_fence *
+to_iio_buffer_dma_fence(struct dma_fence *fence)
+{
+	return container_of(fence, struct iio_buffer_dma_fence, base);
+}
+
 static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
 {
 	return container_of(buf, struct iio_dma_buffer_queue, buffer);
@@ -118,6 +132,48 @@ static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
 	dma_buf_put(block->dmabuf);
 }
 
+static const char *
+iio_buffer_dma_fence_get_driver_name(struct dma_fence *fence)
+{
+	struct iio_buffer_dma_fence *iio_fence = to_iio_buffer_dma_fence(fence);
+
+	return dev_name(iio_fence->block->queue->dev);
+}
+
+static void iio_buffer_dma_fence_release(struct dma_fence *fence)
+{
+	struct iio_buffer_dma_fence *iio_fence = to_iio_buffer_dma_fence(fence);
+
+	kfree(iio_fence);
+}
+
+static const struct dma_fence_ops iio_buffer_dma_fence_ops = {
+	.get_driver_name	= iio_buffer_dma_fence_get_driver_name,
+	.get_timeline_name	= iio_buffer_dma_fence_get_driver_name,
+	.release		= iio_buffer_dma_fence_release,
+};
+
+static struct dma_fence *
+iio_dma_buffer_create_dma_fence(struct iio_dma_buffer_block *block)
+{
+	struct iio_buffer_dma_fence *fence;
+	u64 ctx;
+
+	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+	if (!fence)
+		return ERR_PTR(-ENOMEM);
+
+	fence->block = block;
+	spin_lock_init(&fence->lock);
+
+	ctx = dma_fence_context_alloc(1);
+
+	dma_fence_init(&fence->base, &iio_buffer_dma_fence_ops,
+		       &fence->lock, ctx, 0);
+
+	return &fence->base;
+}
+
 static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
 				     struct dma_buf_attachment *at)
 {
@@ -162,10 +218,26 @@ static void iio_buffer_dma_buf_unmap(struct dma_buf_attachment *at,
 	kfree(dba);
 }
 
+static int iio_buffer_dma_buf_mmap(struct dma_buf *dbuf,
+				   struct vm_area_struct *vma)
+{
+	struct iio_dma_buffer_block *block = dbuf->priv;
+	struct device *dev = block->queue->dev;
+
+	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+
+	if (vma->vm_ops->open)
+		vma->vm_ops->open(vma);
+
+	return dma_mmap_coherent(dev, vma, block->vaddr, block->phys_addr,
+				 vma->vm_end - vma->vm_start);
+}
+
 static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
 {
 	struct iio_dma_buffer_block *block = dbuf->priv;
 	struct iio_dma_buffer_queue *queue = block->queue;
+	bool is_fileio = block->fileio;
 
 	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
 
@@ -175,6 +247,9 @@ static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
 			  block->vaddr, block->phys_addr);
 	kfree(block);
 
+	queue->num_blocks--;
+	if (is_fileio)
+		queue->num_fileio_blocks--;
 	mutex_unlock(&queue->lock);
 	iio_buffer_put(&queue->buffer);
 }
@@ -183,11 +258,12 @@ static const struct dma_buf_ops iio_dma_buffer_dmabuf_ops = {
 	.attach			= iio_buffer_dma_buf_attach,
 	.map_dma_buf		= iio_buffer_dma_buf_map,
 	.unmap_dma_buf		= iio_buffer_dma_buf_unmap,
+	.mmap			= iio_buffer_dma_buf_mmap,
 	.release		= iio_buffer_dma_buf_release,
 };
 
 static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
-	struct iio_dma_buffer_queue *queue, size_t size)
+	struct iio_dma_buffer_queue *queue, size_t size, bool fileio)
 {
 	struct iio_dma_buffer_block *block;
 	DEFINE_DMA_BUF_EXPORT_INFO(einfo);
@@ -218,10 +294,15 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 	block->size = size;
 	block->state = IIO_BLOCK_STATE_DONE;
 	block->queue = queue;
+	block->fileio = fileio;
 	INIT_LIST_HEAD(&block->head);
 
 	iio_buffer_get(&queue->buffer);
 
+	queue->num_blocks++;
+	if (fileio)
+		queue->num_fileio_blocks++;
+
 	return block;
 
 err_free_dma:
@@ -260,14 +341,23 @@ static void iio_dma_buffer_queue_wake(struct iio_dma_buffer_queue *queue)
 void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 {
 	struct iio_dma_buffer_queue *queue = block->queue;
+	struct dma_resv *resv = block->dmabuf->resv;
+	struct dma_fence *fence;
 	unsigned long flags;
 
 	spin_lock_irqsave(&queue->list_lock, flags);
 	_iio_dma_buffer_block_done(block);
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
+	fence = dma_resv_excl_fence(resv);
+	if (fence)
+		dma_fence_signal(fence);
+	dma_resv_unlock(resv);
+
 	iio_buffer_block_put(block);
-	iio_dma_buffer_queue_wake(queue);
+
+	if (queue->fileio.enabled)
+		iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
 
@@ -293,6 +383,9 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
 		block->bytes_used = 0;
 		_iio_dma_buffer_block_done(block);
 
+		if (dma_resv_is_locked(block->dmabuf->resv))
+			dma_resv_unlock(block->dmabuf->resv);
+
 		iio_buffer_block_put(block);
 	}
 	spin_unlock_irqrestore(&queue->list_lock, flags);
@@ -317,6 +410,12 @@ static bool iio_dma_block_reusable(struct iio_dma_buffer_block *block)
 	}
 }
 
+static bool iio_dma_buffer_fileio_mode(struct iio_dma_buffer_queue *queue)
+{
+	return queue->fileio.enabled ||
+		queue->num_blocks == queue->num_fileio_blocks;
+}
+
 /**
  * iio_dma_buffer_request_update() - DMA buffer request_update callback
  * @buffer: The buffer which to request an update
@@ -343,6 +442,12 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 
 	mutex_lock(&queue->lock);
 
+	queue->fileio.enabled = iio_dma_buffer_fileio_mode(queue);
+
+	/* If DMABUFs were created, disable fileio interface */
+	if (!queue->fileio.enabled)
+		goto out_unlock;
+
 	/* Allocations are page aligned */
 	if (PAGE_ALIGN(queue->fileio.block_size) == PAGE_ALIGN(size))
 		try_reuse = true;
@@ -383,7 +488,7 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 		}
 
 		if (!block) {
-			block = iio_dma_buffer_alloc_block(queue, size);
+			block = iio_dma_buffer_alloc_block(queue, size, true);
 			if (IS_ERR(block)) {
 				ret = PTR_ERR(block);
 				goto out_unlock;
@@ -403,12 +508,10 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 		 * iio_dma_buffer_enable() will submit it. Otherwise mark it as
 		 * done, which means it's ready to be dequeued.
 		 */
-		if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
+		if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN)
 			block->state = IIO_BLOCK_STATE_QUEUED;
-			list_add_tail(&block->head, &queue->incoming);
-		} else {
+		else
 			block->state = IIO_BLOCK_STATE_DONE;
-		}
 	}
 
 out_unlock:
@@ -456,6 +559,8 @@ static void iio_dma_buffer_submit_block(struct iio_dma_buffer_queue *queue,
 
 	block->state = IIO_BLOCK_STATE_ACTIVE;
 	iio_buffer_block_get(block);
+	dma_resv_lock(block->dmabuf->resv, NULL);
+
 	ret = queue->ops->submit(queue, block);
 	if (ret) {
 		/*
@@ -487,12 +592,31 @@ int iio_dma_buffer_enable(struct iio_buffer *buffer,
 {
 	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
 	struct iio_dma_buffer_block *block, *_block;
+	unsigned int i;
 
 	mutex_lock(&queue->lock);
 	queue->active = true;
+	queue->fileio.next_dequeue = 0;
+	queue->fileio.enabled = iio_dma_buffer_fileio_mode(queue);
+
+	dev_dbg(queue->dev, "Buffer enabled in %s mode\n",
+		queue->fileio.enabled ? "fileio" : "dmabuf");
+
+	if (queue->fileio.enabled) {
+		for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+			block = queue->fileio.blocks[i];
+
+			if (block->state == IIO_BLOCK_STATE_QUEUED) {
+				iio_buffer_block_get(block);
+				list_add_tail(&block->head, &queue->incoming);
+			}
+		}
+	}
+
 	list_for_each_entry_safe(block, _block, &queue->incoming, head) {
 		list_del(&block->head);
 		iio_dma_buffer_submit_block(queue, block);
+		iio_buffer_block_put(block);
 	}
 	mutex_unlock(&queue->lock);
 
@@ -514,6 +638,7 @@ int iio_dma_buffer_disable(struct iio_buffer *buffer,
 	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
 
 	mutex_lock(&queue->lock);
+	queue->fileio.enabled = false;
 	queue->active = false;
 
 	if (queue->ops && queue->ops->abort)
@@ -533,6 +658,7 @@ static void iio_dma_buffer_enqueue(struct iio_dma_buffer_queue *queue,
 		iio_dma_buffer_submit_block(queue, block);
 	} else {
 		block->state = IIO_BLOCK_STATE_QUEUED;
+		iio_buffer_block_get(block);
 		list_add_tail(&block->head, &queue->incoming);
 	}
 }
@@ -573,6 +699,11 @@ static int iio_dma_buffer_io(struct iio_buffer *buffer,
 
 	mutex_lock(&queue->lock);
 
+	if (!queue->fileio.enabled) {
+		ret = -EBUSY;
+		goto out_unlock;
+	}
+
 	if (!queue->fileio.active_block) {
 		block = iio_dma_buffer_dequeue(queue);
 		if (block == NULL) {
@@ -688,6 +819,121 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_data_available);
 
+int iio_dma_buffer_alloc_dmabuf(struct iio_buffer *buffer,
+				struct iio_dmabuf_alloc_req *req)
+{
+	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
+	struct iio_dma_buffer_block *block;
+	int ret = 0;
+
+	mutex_lock(&queue->lock);
+
+	/*
+	 * If the buffer is enabled and in fileio mode new blocks can't be
+	 * allocated.
+	 */
+	if (queue->fileio.enabled) {
+		ret = -EBUSY;
+		goto out_unlock;
+	}
+
+	if (!req->size || req->size > SIZE_MAX) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	/* Free memory that might be in use for fileio mode */
+	iio_dma_buffer_fileio_free(queue);
+
+	block = iio_dma_buffer_alloc_block(queue, req->size, false);
+	if (IS_ERR(block)) {
+		ret = PTR_ERR(block);
+		goto out_unlock;
+	}
+
+	ret = dma_buf_fd(block->dmabuf, O_CLOEXEC);
+	if (ret < 0) {
+		dma_buf_put(block->dmabuf);
+		goto out_unlock;
+	}
+
+out_unlock:
+	mutex_unlock(&queue->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iio_dma_buffer_alloc_dmabuf);
+
+int iio_dma_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
+				  struct iio_dmabuf *iio_dmabuf)
+{
+	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
+	struct iio_dma_buffer_block *dma_block;
+	struct dma_fence *fence;
+	struct dma_buf *dmabuf;
+	int ret = 0;
+
+	mutex_lock(&queue->lock);
+
+	/* If in fileio mode buffers can't be enqueued. */
+	if (queue->fileio.enabled) {
+		ret = -EBUSY;
+		goto out_unlock;
+	}
+
+	dmabuf = dma_buf_get(iio_dmabuf->fd);
+	if (IS_ERR(dmabuf)) {
+		ret = PTR_ERR(dmabuf);
+		goto out_unlock;
+	}
+
+	if (dmabuf->ops != &iio_dma_buffer_dmabuf_ops) {
+		dev_err(queue->dev, "importing DMABUFs from other drivers is not yet supported.\n");
+		ret = -EINVAL;
+		goto out_dma_buf_put;
+	}
+
+	dma_block = dmabuf->priv;
+
+	if (iio_dmabuf->bytes_used > dma_block->size) {
+		ret = -EINVAL;
+		goto out_dma_buf_put;
+	}
+
+	dma_block->bytes_used = iio_dmabuf->bytes_used ?: dma_block->size;
+
+	switch (dma_block->state) {
+	case IIO_BLOCK_STATE_QUEUED:
+		/* Nothing to do */
+		goto out_unlock;
+	case IIO_BLOCK_STATE_DONE:
+		break;
+	default:
+		ret = -EBUSY;
+		goto out_dma_buf_put;
+	}
+
+	fence = iio_dma_buffer_create_dma_fence(dma_block);
+	if (IS_ERR(fence)) {
+		ret = PTR_ERR(fence);
+		goto out_dma_buf_put;
+	}
+
+	dma_resv_lock(dmabuf->resv, NULL);
+	dma_resv_add_excl_fence(dmabuf->resv, fence);
+	dma_resv_unlock(dmabuf->resv);
+
+	iio_dma_buffer_enqueue(queue, dma_block);
+
+out_dma_buf_put:
+	dma_buf_put(dmabuf);
+out_unlock:
+	mutex_unlock(&queue->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iio_dma_buffer_enqueue_dmabuf);
+
 /**
  * iio_dma_buffer_set_bytes_per_datum() - DMA buffer set_bytes_per_datum callback
  * @buffer: Buffer to set the bytes-per-datum for
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 6b3fa7d2124b..5bd687132355 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -40,6 +40,7 @@ enum iio_block_state {
  * @phys_addr: Physical address of the blocks memory
  * @queue: Parent DMA buffer queue
  * @state: Current state of the block
+ * @fileio: True if this buffer is used for fileio mode
  * @dmabuf: Underlying DMABUF object
  */
 struct iio_dma_buffer_block {
@@ -62,6 +63,7 @@ struct iio_dma_buffer_block {
 	 */
 	enum iio_block_state state;
 
+	bool fileio;
 	struct dma_buf *dmabuf;
 };
 
@@ -72,6 +74,7 @@ struct iio_dma_buffer_block {
  * @pos: Read offset in the active block
  * @block_size: Size of each block
  * @next_dequeue: index of next block that will be dequeued
+ * @enabled: Whether the buffer is operating in fileio mode
  */
 struct iio_dma_buffer_queue_fileio {
 	struct iio_dma_buffer_block *blocks[2];
@@ -80,6 +83,7 @@ struct iio_dma_buffer_queue_fileio {
 	size_t block_size;
 
 	unsigned int next_dequeue;
+	bool enabled;
 };
 
 /**
@@ -95,6 +99,8 @@ struct iio_dma_buffer_queue_fileio {
  *   the DMA controller
  * @incoming: List of buffers on the incoming queue
  * @active: Whether the buffer is currently active
+ * @num_blocks: Total number of blocks in the queue
+ * @num_fileio_blocks: Number of blocks used for fileio interface
  * @fileio: FileIO state
  */
 struct iio_dma_buffer_queue {
@@ -107,6 +113,8 @@ struct iio_dma_buffer_queue {
 	struct list_head incoming;
 
 	bool active;
+	unsigned int num_blocks;
+	unsigned int num_fileio_blocks;
 
 	struct iio_dma_buffer_queue_fileio fileio;
 };
@@ -149,4 +157,9 @@ static inline size_t iio_dma_buffer_space_available(struct iio_buffer *buffer)
 	return iio_dma_buffer_data_available(buffer);
 }
 
+int iio_dma_buffer_alloc_dmabuf(struct iio_buffer *buffer,
+				struct iio_dmabuf_alloc_req *req);
+int iio_dma_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
+				  struct iio_dmabuf *dmabuf);
+
 #endif
-- 
2.34.1



* [PATCH v2 09/12] iio: buffer-dmaengine: Support new DMABUF based userspace API
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (7 preceding siblings ...)
  2022-02-07 12:59 ` [PATCH v2 08/12] iio: buffer-dma: Implement new DMABUF based userspace API Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-02-07 12:59 ` [PATCH v2 10/12] iio: core: Add support for cyclic buffers Paul Cercueil
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Use the functions provided by the buffer-dma core to implement the
DMABUF userspace API in the buffer-dmaengine IIO buffer implementation.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
---
 drivers/iio/buffer/industrialio-buffer-dmaengine.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index 5cde8fd81c7f..57a8b2e4ba3c 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -133,6 +133,9 @@ static const struct iio_buffer_access_funcs iio_dmaengine_buffer_ops = {
 	.space_available = iio_dma_buffer_space_available,
 	.release = iio_dmaengine_buffer_release,
 
+	.alloc_dmabuf = iio_dma_buffer_alloc_dmabuf,
+	.enqueue_dmabuf = iio_dma_buffer_enqueue_dmabuf,
+
 	.modes = INDIO_BUFFER_HARDWARE,
 	.flags = INDIO_BUFFER_FLAG_FIXED_WATERMARK,
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 10/12] iio: core: Add support for cyclic buffers
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (8 preceding siblings ...)
  2022-02-07 12:59 ` [PATCH v2 09/12] iio: buffer-dmaengine: Support " Paul Cercueil
@ 2022-02-07 12:59 ` Paul Cercueil
  2022-02-07 13:01 ` [PATCH v2 11/12] iio: buffer-dmaengine: " Paul Cercueil
  2022-02-13 18:46 ` [PATCH v2 00/12] iio: buffer-dma: write() and new " Jonathan Cameron
  11 siblings, 0 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 12:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Introduce a new flag IIO_BUFFER_DMABUF_CYCLIC in the "flags" field of
the iio_dmabuf uapi structure.

When set, the DMABUF enqueued with the enqueue ioctl will be endlessly
repeated on the TX output, until the buffer is disabled.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
---
 drivers/iio/industrialio-buffer.c | 5 +++++
 include/uapi/linux/iio/buffer.h   | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
index 72f333a519bc..85331cedaad8 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -1535,6 +1535,11 @@ static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
 	if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
 		return -EINVAL;
 
+	/* Cyclic flag is only supported on output buffers */
+	if ((dmabuf.flags & IIO_BUFFER_DMABUF_CYCLIC) &&
+	    buffer->direction != IIO_BUFFER_DIRECTION_OUT)
+		return -EINVAL;
+
 	return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
 }
 
diff --git a/include/uapi/linux/iio/buffer.h b/include/uapi/linux/iio/buffer.h
index e4621b926262..2d541d038c02 100644
--- a/include/uapi/linux/iio/buffer.h
+++ b/include/uapi/linux/iio/buffer.h
@@ -7,7 +7,8 @@
 
 #include <linux/types.h>
 
-#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000000
+#define IIO_BUFFER_DMABUF_CYCLIC		(1 << 0)
+#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000001
 
 /**
  * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO DMABUFs
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread
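The flag checks added in this patch are self-contained enough to model outside the kernel. The sketch below mirrors the logic of iio_buffer_enqueue_dmabuf() with the uapi values from this patch; `check_dmabuf_flags` and the direction enum stand-ins are illustrative names, not part of the series:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Values from include/uapi/linux/iio/buffer.h as modified by this patch. */
#define IIO_BUFFER_DMABUF_CYCLIC		(1 << 0)
#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000001

/* Stand-in for the buffer direction used by the kernel check. */
enum iio_buffer_direction {
	IIO_BUFFER_DIRECTION_IN,
	IIO_BUFFER_DIRECTION_OUT,
};

/*
 * Mirror of the checks in iio_buffer_enqueue_dmabuf(): reject unknown
 * flags, and reject the cyclic flag on anything but an output buffer.
 */
static int check_dmabuf_flags(uint32_t flags, enum iio_buffer_direction dir)
{
	if (flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
		return -EINVAL;

	if ((flags & IIO_BUFFER_DMABUF_CYCLIC) &&
	    dir != IIO_BUFFER_DIRECTION_OUT)
		return -EINVAL;

	return 0;
}
```

Note that a capture (input) buffer enqueuing a cyclic DMABUF is rejected, since repeating a fixed buffer only makes sense on the TX path.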

* [PATCH v2 11/12] iio: buffer-dmaengine: Add support for cyclic buffers
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (9 preceding siblings ...)
  2022-02-07 12:59 ` [PATCH v2 10/12] iio: core: Add support for cyclic buffers Paul Cercueil
@ 2022-02-07 13:01 ` Paul Cercueil
  2022-02-07 13:01   ` [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API Paul Cercueil
  2022-02-13 18:46 ` [PATCH v2 00/12] iio: buffer-dma: write() and new " Jonathan Cameron
  11 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 13:01 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Handle the IIO_BUFFER_DMABUF_CYCLIC flag to support cyclic buffers.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
---
 drivers/iio/buffer/industrialio-buffer-dma.c      |  1 +
 .../iio/buffer/industrialio-buffer-dmaengine.c    | 15 ++++++++++++---
 include/linux/iio/buffer-dma.h                    |  3 +++
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index b9c3b01c5ea0..6185af2f33f0 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -901,6 +901,7 @@ int iio_dma_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
 	}
 
 	dma_block->bytes_used = iio_dmabuf->bytes_used ?: dma_block->size;
+	dma_block->cyclic = iio_dmabuf->flags & IIO_BUFFER_DMABUF_CYCLIC;
 
 	switch (dma_block->state) {
 	case IIO_BLOCK_STATE_QUEUED:
diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index 57a8b2e4ba3c..952e2160a11e 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -81,9 +81,18 @@ static int iio_dmaengine_buffer_submit_block(struct iio_dma_buffer_queue *queue,
 	if (!block->bytes_used || block->bytes_used > max_size)
 		return -EINVAL;
 
-	desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
-		block->phys_addr, block->bytes_used, dma_dir,
-		DMA_PREP_INTERRUPT);
+	if (block->cyclic) {
+		desc = dmaengine_prep_dma_cyclic(dmaengine_buffer->chan,
+						 block->phys_addr,
+						 block->size,
+						 block->bytes_used,
+						 dma_dir, 0);
+	} else {
+		desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
+						   block->phys_addr,
+						   block->bytes_used, dma_dir,
+						   DMA_PREP_INTERRUPT);
+	}
 	if (!desc)
 		return -ENOMEM;
 
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 5bd687132355..3a5d9169e573 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -40,6 +40,7 @@ enum iio_block_state {
  * @phys_addr: Physical address of the blocks memory
  * @queue: Parent DMA buffer queue
  * @state: Current state of the block
+ * @cyclic: True if this is a cyclic buffer
  * @fileio: True if this buffer is used for fileio mode
  * @dmabuf: Underlying DMABUF object
  */
@@ -63,6 +64,8 @@ struct iio_dma_buffer_block {
 	 */
 	enum iio_block_state state;
 
+	bool cyclic;
+
 	bool fileio;
 	struct dma_buf *dmabuf;
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread
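The submit path above now branches on block->cyclic: a cyclic block is prepared with dmaengine_prep_dma_cyclic() over the full block with bytes_used as the period, while a one-shot block uses dmaengine_prep_slave_single() over bytes_used only. A minimal standalone model of that decision (dmaengine types replaced by plain values; the function name is made up for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum desc_kind { DESC_NONE, DESC_SINGLE, DESC_CYCLIC };

/*
 * Model of the descriptor choice in iio_dmaengine_buffer_submit_block():
 * validate bytes_used first (the kernel returns -EINVAL), then pick a
 * cyclic or one-shot transfer based on the block's cyclic flag.
 */
static enum desc_kind pick_descriptor(size_t bytes_used, size_t max_size,
				      bool cyclic)
{
	if (!bytes_used || bytes_used > max_size)
		return DESC_NONE;	/* rejected with -EINVAL */

	return cyclic ? DESC_CYCLIC : DESC_SINGLE;
}
```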

* [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API
  2022-02-07 13:01 ` [PATCH v2 11/12] iio: buffer-dmaengine: " Paul Cercueil
@ 2022-02-07 13:01   ` Paul Cercueil
  2022-03-29  8:54     ` Daniel Vetter
  0 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-02-07 13:01 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio, Paul Cercueil

Document the new DMABUF based API.

v2: - Explicitly state that the new interface is optional and is
      not implemented by all drivers.
    - The IOCTLs can now only be called on the buffer FD returned by
      IIO_BUFFER_GET_FD_IOCTL.
    - Move the page up a bit in the index since it is core stuff and not
      driver-specific.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
---
 Documentation/driver-api/dma-buf.rst |  2 +
 Documentation/iio/dmabuf_api.rst     | 94 ++++++++++++++++++++++++++++
 Documentation/iio/index.rst          |  2 +
 3 files changed, 98 insertions(+)
 create mode 100644 Documentation/iio/dmabuf_api.rst

diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 2cd7db82d9fe..d3c9b58d2706 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -1,3 +1,5 @@
+.. _dma-buf:
+
 Buffer Sharing and Synchronization
 ==================================
 
diff --git a/Documentation/iio/dmabuf_api.rst b/Documentation/iio/dmabuf_api.rst
new file mode 100644
index 000000000000..43bb2c1b9fdc
--- /dev/null
+++ b/Documentation/iio/dmabuf_api.rst
@@ -0,0 +1,94 @@
+===================================
+High-speed DMABUF interface for IIO
+===================================
+
+1. Overview
+===========
+
+The Industrial I/O subsystem supports access to buffers through a file-based
+interface, with read() and write() access calls through the IIO device's dev
+node.
+
+It additionally supports a DMABUF based interface, where the userspace
+application can allocate and append DMABUF objects to the buffer's queue.
+This interface is however optional and is not available in all drivers.
+
+The advantage of this DMABUF based interface over the read()
+interface is that it avoids an extra copy of the data between the
+kernel and userspace. This is particularly useful for high-speed
+devices which produce several megabytes or even gigabytes of data per
+second.
+
+The data in this DMABUF interface is managed at the granularity of
+DMABUF objects. Reducing the granularity from byte level to block level
+is done to reduce the userspace-kernelspace synchronization overhead
+since performing syscalls for each byte at a few Mbps is just not
+feasible.
+
+This of course leads to a slightly increased latency. For this reason an
+application can choose the size of the DMABUFs as well as how many it
+allocates. E.g. two DMABUFs would be a traditional double buffering
+scheme. But using a higher number might be necessary to avoid
+underflow/overflow situations in the presence of scheduling latencies.
+
+2. User API
+===========
+
+``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *)``
+----------------------------------------------------------------
+
+Each call allocates a new DMABUF object. The return value is the file
+descriptor of the new DMABUF, or a negative errno code to indicate an
+error.
+
+``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)``
+--------------------------------------------------------
+
+Place the DMABUF object on the queue of buffers pending hardware processing.
+
+These two IOCTLs have to be performed on the IIO buffer's file
+descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL` ioctl.
+
+3. Usage
+========
+
+For userspace to access the data stored in a block, the block must be
+mapped into the process's memory. This is done by calling mmap() on the
+DMABUF's file descriptor.
+
+Before accessing the data through the map, you must use the
+DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
+DMA_BUF_SYNC_START flag, to make sure that the data is available.
+This call may block until the hardware is done with this block. Once
+you are done reading or writing the data, you must use this ioctl again
+with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
+kernel's queue.
+
+If you need to know when the hardware is done with a DMABUF, you can
+poll its file descriptor for the EPOLLOUT event.
+
+Finally, to destroy a DMABUF object, simply call close() on its file
+descriptor.
+
+For more information about manipulating DMABUF objects, see: :ref:`dma-buf`.
+
+A typical workflow for the new interface is:
+
+    for block in blocks:
+      DMABUF_ALLOC block
+      mmap block
+
+    enable buffer
+
+    while !done
+      for block in blocks:
+        DMABUF_ENQUEUE block
+
+        DMABUF_SYNC_START block
+        process data
+        DMABUF_SYNC_END block
+
+    disable buffer
+
+    for block in blocks:
+      close block
diff --git a/Documentation/iio/index.rst b/Documentation/iio/index.rst
index 58b7a4ebac51..669deb67ddee 100644
--- a/Documentation/iio/index.rst
+++ b/Documentation/iio/index.rst
@@ -9,4 +9,6 @@ Industrial I/O
 
    iio_configfs
 
+   dmabuf_api
+
    ep93xx_adc
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread
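The DMA_BUF_IOCTL_SYNC bracketing described in the documentation above amounts to an ownership hand-off between the hardware and the CPU. The toy model below illustrates that protocol only; it is not the real dma-buf implementation (which, among other things, allows repeated syncs), and the function names are invented for this sketch:

```c
#include <assert.h>
#include <errno.h>

/*
 * Toy model: a DMABUF is either owned by the device (enqueueable) or
 * mapped for CPU access.  Userspace must bracket its accesses with
 * DMA_BUF_SYNC_START / DMA_BUF_SYNC_END before handing the buffer back.
 */
enum owner { OWNER_DEVICE, OWNER_CPU };

/* DMA_BUF_SYNC_START: claim the buffer for CPU access. */
static int sync_start(enum owner *o)
{
	if (*o != OWNER_DEVICE)
		return -EBUSY;	/* already claimed in this strict model */
	*o = OWNER_CPU;
	return 0;
}

/* DMA_BUF_SYNC_END: release the buffer back to the device. */
static int sync_end(enum owner *o)
{
	if (*o != OWNER_CPU)
		return -EBUSY;
	*o = OWNER_DEVICE;
	return 0;
}

/* Enqueueing is only valid once the CPU has released the buffer. */
static int enqueue(enum owner o)
{
	return o == OWNER_DEVICE ? 0 : -EBUSY;
}
```

Walking the enqueue / SYNC_START / process / SYNC_END loop from the workflow pseudocode through this model shows why SYNC_END must precede the next DMABUF_ENQUEUE.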

* Re: [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
  2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
                   ` (10 preceding siblings ...)
  2022-02-07 13:01 ` [PATCH v2 11/12] iio: buffer-dmaengine: " Paul Cercueil
@ 2022-02-13 18:46 ` Jonathan Cameron
  2022-02-15 17:43   ` Paul Cercueil
  11 siblings, 1 reply; 41+ messages in thread
From: Jonathan Cameron @ 2022-02-13 18:46 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

On Mon,  7 Feb 2022 12:59:21 +0000
Paul Cercueil <paul@crapouillou.net> wrote:

> Hi Jonathan,
> 
> This is the V2 of my patchset that introduces a new userspace interface
> based on DMABUF objects to complement the fileio API, and adds write()
> support to the existing fileio API.

Hi Paul,

It's been a little while. Perhaps you could summarize the various
viewpoints around the appropriateness of using DMABUF for this?
I appreciate it is a tricky topic to distil into a brief summary but
I know I would find it useful even if no one else does!

Thanks,

Jonathan

> 
> Changes since v1:
> 
> - the patches that were merged in v1 have been (obviously) dropped from
>   this patchset;
> - the patch that was setting the write-combine cache setting has been
>   dropped as well, as it was simply not useful.
> - [01/12]: 
>     * Only remove the outgoing queue, and keep the incoming queue, as we
>       want the buffer to start streaming data as soon as it is enabled.
>     * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
>       same as IIO_BLOCK_STATE_DONE.
> - [02/12]:
>     * Fix block->state not being reset in
>       iio_dma_buffer_request_update() for output buffers.
>     * Only update block->bytes_used once and add a comment about why we
>       update it.
>     * Add a comment about why we're setting a different state for output
>       buffers in iio_dma_buffer_request_update()
>     * Remove useless cast to bool (!!) in iio_dma_buffer_io()
> - [05/12]:
>     Only allow the new IOCTLs on the buffer FD created with
>     IIO_BUFFER_GET_FD_IOCTL().
> - [12/12]:
>     * Explicitly state that the new interface is optional and is
>       not implemented by all drivers.
>     * The IOCTLs can now only be called on the buffer FD returned by
>       IIO_BUFFER_GET_FD_IOCTL.
>     * Move the page up a bit in the index since it is core stuff and not
>       driver-specific.
> 
> The patches not listed here have not been modified since v1.
> 
> Cheers,
> -Paul
> 
> Alexandru Ardelean (1):
>   iio: buffer-dma: split iio_dma_buffer_fileio_free() function
> 
> Paul Cercueil (11):
>   iio: buffer-dma: Get rid of outgoing queue
>   iio: buffer-dma: Enable buffer write support
>   iio: buffer-dmaengine: Support specifying buffer direction
>   iio: buffer-dmaengine: Enable write support
>   iio: core: Add new DMABUF interface infrastructure
>   iio: buffer-dma: Use DMABUFs instead of custom solution
>   iio: buffer-dma: Implement new DMABUF based userspace API
>   iio: buffer-dmaengine: Support new DMABUF based userspace API
>   iio: core: Add support for cyclic buffers
>   iio: buffer-dmaengine: Add support for cyclic buffers
>   Documentation: iio: Document high-speed DMABUF based API
> 
>  Documentation/driver-api/dma-buf.rst          |   2 +
>  Documentation/iio/dmabuf_api.rst              |  94 +++
>  Documentation/iio/index.rst                   |   2 +
>  drivers/iio/adc/adi-axi-adc.c                 |   3 +-
>  drivers/iio/buffer/industrialio-buffer-dma.c  | 610 ++++++++++++++----
>  .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
>  drivers/iio/industrialio-buffer.c             |  60 ++
>  include/linux/iio/buffer-dma.h                |  38 +-
>  include/linux/iio/buffer-dmaengine.h          |   5 +-
>  include/linux/iio/buffer_impl.h               |   8 +
>  include/uapi/linux/iio/buffer.h               |  30 +
>  11 files changed, 749 insertions(+), 145 deletions(-)
>  create mode 100644 Documentation/iio/dmabuf_api.rst
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue
  2022-02-07 12:59 ` [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue Paul Cercueil
@ 2022-02-13 18:57   ` Jonathan Cameron
  2022-02-13 19:25     ` Paul Cercueil
  2022-03-28 17:17   ` Jonathan Cameron
  1 sibling, 1 reply; 41+ messages in thread
From: Jonathan Cameron @ 2022-02-13 18:57 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

On Mon,  7 Feb 2022 12:59:22 +0000
Paul Cercueil <paul@crapouillou.net> wrote:

> The buffer-dma code was using two queues, incoming and outgoing, to
> manage the state of the blocks in use.
> 
> While this totally works, it adds some complexity to the code,
> especially since the code only manages 2 blocks. It is much easier to
> just check each block's state manually, and keep a counter for the next
> block to dequeue.
> 
> Since the new DMABUF based API wouldn't use the outgoing queue anyway,
> getting rid of it now makes the upcoming changes simpler.
> 
> With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can
> be removed.
> 
> v2: - Only remove the outgoing queue, and keep the incoming queue, as we
>       want the buffer to start streaming data as soon as it is enabled.
>     - Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
>       same as IIO_BLOCK_STATE_DONE.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> ---

Trivial process thing but change log should be here, not above as we don't
want it to end up in the main git log.


>  drivers/iio/buffer/industrialio-buffer-dma.c | 44 ++++++++++----------
>  include/linux/iio/buffer-dma.h               |  7 ++--
>  2 files changed, 26 insertions(+), 25 deletions(-)
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread
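The counter-based scheme the commit message above describes replaces the outgoing list with a next_dequeue index over the two fileio blocks: a block is only handed out when it has reached the DONE state, and the index then advances modulo the array size. A standalone model of iio_dma_buffer_dequeue() after this patch (state names as in buffer-dma.h; the struct and function are illustrative):

```c
#include <assert.h>

enum iio_block_state {
	IIO_BLOCK_STATE_QUEUED,
	IIO_BLOCK_STATE_ACTIVE,
	IIO_BLOCK_STATE_DONE,
	IIO_BLOCK_STATE_DEAD,
};

#define NUM_BLOCKS 2	/* fileio uses a fixed array of two blocks */

struct fileio_model {
	enum iio_block_state state[NUM_BLOCKS];
	unsigned int next_dequeue;
};

/*
 * Model of iio_dma_buffer_dequeue() after this patch: look at the block
 * the counter points to, and advance the counter only when that block
 * is DONE.  Returns the dequeued index, or -1 if no block is ready.
 */
static int model_dequeue(struct fileio_model *q)
{
	unsigned int idx = q->next_dequeue;

	if (q->state[idx] != IIO_BLOCK_STATE_DONE)
		return -1;

	q->next_dequeue = (idx + 1) % NUM_BLOCKS;
	return (int)idx;
}
```

Because completion order of the two blocks is fixed by submission order, checking only the block at next_dequeue is sufficient, which is what makes the outgoing list redundant.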

* Re: [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue
  2022-02-13 18:57   ` Jonathan Cameron
@ 2022-02-13 19:25     ` Paul Cercueil
  0 siblings, 0 replies; 41+ messages in thread
From: Paul Cercueil @ 2022-02-13 19:25 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

Hi Jonathan,

Le dim., févr. 13 2022 at 18:57:40 +0000, Jonathan Cameron 
<jic23@kernel.org> a écrit :
> On Mon,  7 Feb 2022 12:59:22 +0000
> Paul Cercueil <paul@crapouillou.net> wrote:
> 
>>  The buffer-dma code was using two queues, incoming and outgoing, to
>>  manage the state of the blocks in use.
>> 
>>  While this totally works, it adds some complexity to the code,
>>  especially since the code only manages 2 blocks. It is much easier 
>> to
>>  just check each block's state manually, and keep a counter for the 
>> next
>>  block to dequeue.
>> 
>>  Since the new DMABUF based API wouldn't use the outgoing queue 
>> anyway,
>>  getting rid of it now makes the upcoming changes simpler.
>> 
>>  With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and 
>> can
>>  be removed.
>> 
>>  v2: - Only remove the outgoing queue, and keep the incoming queue, 
>> as we
>>        want the buffer to start streaming data as soon as it is 
>> enabled.
>>      - Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally 
>> the
>>        same as IIO_BLOCK_STATE_DONE.
>> 
>>  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
>>  ---
> 
> Trivial process thing but change log should be here, not above as we 
> don't
> want it to end up in the main git log.

I'm kinda used to doing this now, it's the policy for sending patches
to the DRM tree. I like it because "git notes" disappear after rebases
and it's a pain. At least like this I don't lose the changelog.

But okay, I'll change it for v3, if there's a v3.

Cheers,
-Paul

>>   drivers/iio/buffer/industrialio-buffer-dma.c | 44 
>> ++++++++++----------
>>   include/linux/iio/buffer-dma.h               |  7 ++--
>>   2 files changed, 26 insertions(+), 25 deletions(-)
>> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
  2022-02-13 18:46 ` [PATCH v2 00/12] iio: buffer-dma: write() and new " Jonathan Cameron
@ 2022-02-15 17:43   ` Paul Cercueil
  2022-03-29  8:33     ` Daniel Vetter
  0 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-02-15 17:43 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

Hi Jonathan,

Le dim., févr. 13 2022 at 18:46:16 +0000, Jonathan Cameron 
<jic23@kernel.org> a écrit :
> On Mon,  7 Feb 2022 12:59:21 +0000
> Paul Cercueil <paul@crapouillou.net> wrote:
> 
>>  Hi Jonathan,
>> 
>>  This is the V2 of my patchset that introduces a new userspace 
>> interface
>>  based on DMABUF objects to complement the fileio API, and adds 
>> write()
>>  support to the existing fileio API.
> 
> Hi Paul,
> 
> It's been a little while. Perhaps you could summarize the various view
> points around the appropriateness of using DMABUF for this?
> I appreciate it is a tricky topic to distil into a brief summary but
> I know I would find it useful even if no one else does!

So we want to have a high-speed interface where buffers of samples are 
passed around between IIO devices and other devices (e.g. USB or 
network), or made available to userspace without copying the data.

DMABUF is, at least in theory, exactly what we need. Quoting the 
documentation 
(https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html):
"The dma-buf subsystem provides the framework for sharing buffers for 
hardware (DMA) access across multiple device drivers and subsystems, 
and for synchronizing asynchronous hardware access. This is used, for 
example, by drm “prime” multi-GPU support, but is of course not 
limited to GPU use cases."

The problem is that right now DMABUF is only really used by DRM, and to 
quote Daniel, "dma-buf looks like something super generic and useful, 
until you realize that there's a metric ton of gpu/accelerator baggage 
piled in".

Still, it seems to be the only viable option. We could add a custom 
buffer-passing interface, but that would mean implementing the same 
buffer-passing interface on the network and USB stacks, and before we 
know it we re-invented DMABUFs.

Cheers,
-Paul


>> 
>>  Changes since v1:
>> 
>>  - the patches that were merged in v1 have been (obviously) dropped 
>> from
>>    this patchset;
>>  - the patch that was setting the write-combine cache setting has 
>> been
>>    dropped as well, as it was simply not useful.
>>  - [01/12]:
>>      * Only remove the outgoing queue, and keep the incoming queue, 
>> as we
>>        want the buffer to start streaming data as soon as it is 
>> enabled.
>>      * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally 
>> the
>>        same as IIO_BLOCK_STATE_DONE.
>>  - [02/12]:
>>      * Fix block->state not being reset in
>>        iio_dma_buffer_request_update() for output buffers.
>>      * Only update block->bytes_used once and add a comment about 
>> why we
>>        update it.
>>      * Add a comment about why we're setting a different state for 
>> output
>>        buffers in iio_dma_buffer_request_update()
>>      * Remove useless cast to bool (!!) in iio_dma_buffer_io()
>>  - [05/12]:
>>      Only allow the new IOCTLs on the buffer FD created with
>>      IIO_BUFFER_GET_FD_IOCTL().
>>  - [12/12]:
>>      * Explicitly state that the new interface is optional and is
>>        not implemented by all drivers.
>>      * The IOCTLs can now only be called on the buffer FD returned by
>>        IIO_BUFFER_GET_FD_IOCTL.
>>      * Move the page up a bit in the index since it is core stuff 
>> and not
>>        driver-specific.
>> 
>>  The patches not listed here have not been modified since v1.
>> 
>>  Cheers,
>>  -Paul
>> 
>>  Alexandru Ardelean (1):
>>    iio: buffer-dma: split iio_dma_buffer_fileio_free() function
>> 
>>  Paul Cercueil (11):
>>    iio: buffer-dma: Get rid of outgoing queue
>>    iio: buffer-dma: Enable buffer write support
>>    iio: buffer-dmaengine: Support specifying buffer direction
>>    iio: buffer-dmaengine: Enable write support
>>    iio: core: Add new DMABUF interface infrastructure
>>    iio: buffer-dma: Use DMABUFs instead of custom solution
>>    iio: buffer-dma: Implement new DMABUF based userspace API
>>    iio: buffer-dmaengine: Support new DMABUF based userspace API
>>    iio: core: Add support for cyclic buffers
>>    iio: buffer-dmaengine: Add support for cyclic buffers
>>    Documentation: iio: Document high-speed DMABUF based API
>> 
>>   Documentation/driver-api/dma-buf.rst          |   2 +
>>   Documentation/iio/dmabuf_api.rst              |  94 +++
>>   Documentation/iio/index.rst                   |   2 +
>>   drivers/iio/adc/adi-axi-adc.c                 |   3 +-
>>   drivers/iio/buffer/industrialio-buffer-dma.c  | 610 
>> ++++++++++++++----
>>   .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
>>   drivers/iio/industrialio-buffer.c             |  60 ++
>>   include/linux/iio/buffer-dma.h                |  38 +-
>>   include/linux/iio/buffer-dmaengine.h          |   5 +-
>>   include/linux/iio/buffer_impl.h               |   8 +
>>   include/uapi/linux/iio/buffer.h               |  30 +
>>   11 files changed, 749 insertions(+), 145 deletions(-)
>>   create mode 100644 Documentation/iio/dmabuf_api.rst
>> 
> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue
  2022-02-07 12:59 ` [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue Paul Cercueil
  2022-02-13 18:57   ` Jonathan Cameron
@ 2022-03-28 17:17   ` Jonathan Cameron
  1 sibling, 0 replies; 41+ messages in thread
From: Jonathan Cameron @ 2022-03-28 17:17 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

On Mon,  7 Feb 2022 12:59:22 +0000
Paul Cercueil <paul@crapouillou.net> wrote:

> The buffer-dma code was using two queues, incoming and outgoing, to
> manage the state of the blocks in use.
> 
> While this totally works, it adds some complexity to the code,
> especially since the code only manages 2 blocks. It is much easier to
> just check each block's state manually, and keep a counter for the next
> block to dequeue.
> 
> Since the new DMABUF based API wouldn't use the outgoing queue anyway,
> getting rid of it now makes the upcoming changes simpler.
> 
> With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can
> be removed.
> 
> v2: - Only remove the outgoing queue, and keep the incoming queue, as we
>       want the buffer to start streaming data as soon as it is enabled.
>     - Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
>       same as IIO_BLOCK_STATE_DONE.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>

Hi Paul,

In the interests of moving things forward / simplifying what people need
to look at: This change looks good to me on its own.

Lars had some comments on v1.  Lars, could you take a look at this and
verify if this version addresses the points you raised (I think it does
but they were your comments so better you judge)

Thanks,

Jonathan

> ---
>  drivers/iio/buffer/industrialio-buffer-dma.c | 44 ++++++++++----------
>  include/linux/iio/buffer-dma.h               |  7 ++--
>  2 files changed, 26 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
> index d348af8b9705..1fc91467d1aa 100644
> --- a/drivers/iio/buffer/industrialio-buffer-dma.c
> +++ b/drivers/iio/buffer/industrialio-buffer-dma.c
> @@ -179,7 +179,7 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
>  	}
>  
>  	block->size = size;
> -	block->state = IIO_BLOCK_STATE_DEQUEUED;
> +	block->state = IIO_BLOCK_STATE_DONE;
>  	block->queue = queue;
>  	INIT_LIST_HEAD(&block->head);
>  	kref_init(&block->kref);
> @@ -191,16 +191,8 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
>  
>  static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
>  {
> -	struct iio_dma_buffer_queue *queue = block->queue;
> -
> -	/*
> -	 * The buffer has already been freed by the application, just drop the
> -	 * reference.
> -	 */
> -	if (block->state != IIO_BLOCK_STATE_DEAD) {
> +	if (block->state != IIO_BLOCK_STATE_DEAD)
>  		block->state = IIO_BLOCK_STATE_DONE;
> -		list_add_tail(&block->head, &queue->outgoing);
> -	}
>  }
>  
>  /**
> @@ -261,7 +253,6 @@ static bool iio_dma_block_reusable(struct iio_dma_buffer_block *block)
>  	 * not support abort and has not given back the block yet.
>  	 */
>  	switch (block->state) {
> -	case IIO_BLOCK_STATE_DEQUEUED:
>  	case IIO_BLOCK_STATE_QUEUED:
>  	case IIO_BLOCK_STATE_DONE:
>  		return true;
> @@ -317,7 +308,6 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
>  	 * dead. This means we can reset the lists without having to fear
>  	 * corrution.
>  	 */
> -	INIT_LIST_HEAD(&queue->outgoing);
>  	spin_unlock_irq(&queue->list_lock);
>  
>  	INIT_LIST_HEAD(&queue->incoming);
> @@ -456,14 +446,20 @@ static struct iio_dma_buffer_block *iio_dma_buffer_dequeue(
>  	struct iio_dma_buffer_queue *queue)
>  {
>  	struct iio_dma_buffer_block *block;
> +	unsigned int idx;
>  
>  	spin_lock_irq(&queue->list_lock);
> -	block = list_first_entry_or_null(&queue->outgoing, struct
> -		iio_dma_buffer_block, head);
> -	if (block != NULL) {
> -		list_del(&block->head);
> -		block->state = IIO_BLOCK_STATE_DEQUEUED;
> +
> +	idx = queue->fileio.next_dequeue;
> +	block = queue->fileio.blocks[idx];
> +
> +	if (block->state == IIO_BLOCK_STATE_DONE) {
> +		idx = (idx + 1) % ARRAY_SIZE(queue->fileio.blocks);
> +		queue->fileio.next_dequeue = idx;
> +	} else {
> +		block = NULL;
>  	}
> +
>  	spin_unlock_irq(&queue->list_lock);
>  
>  	return block;
> @@ -539,6 +535,7 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
>  	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buf);
>  	struct iio_dma_buffer_block *block;
>  	size_t data_available = 0;
> +	unsigned int i;
>  
>  	/*
>  	 * For counting the available bytes we'll use the size of the block not
> @@ -552,8 +549,15 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
>  		data_available += queue->fileio.active_block->size;
>  
>  	spin_lock_irq(&queue->list_lock);
> -	list_for_each_entry(block, &queue->outgoing, head)
> -		data_available += block->size;
> +
> +	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
> +		block = queue->fileio.blocks[i];
> +
> +		if (block != queue->fileio.active_block
> +		    && block->state == IIO_BLOCK_STATE_DONE)
> +			data_available += block->size;
> +	}
> +
>  	spin_unlock_irq(&queue->list_lock);
>  	mutex_unlock(&queue->lock);
>  
> @@ -617,7 +621,6 @@ int iio_dma_buffer_init(struct iio_dma_buffer_queue *queue,
>  	queue->ops = ops;
>  
>  	INIT_LIST_HEAD(&queue->incoming);
> -	INIT_LIST_HEAD(&queue->outgoing);
>  
>  	mutex_init(&queue->lock);
>  	spin_lock_init(&queue->list_lock);
> @@ -645,7 +648,6 @@ void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue)
>  			continue;
>  		queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
>  	}
> -	INIT_LIST_HEAD(&queue->outgoing);
>  	spin_unlock_irq(&queue->list_lock);
>  
>  	INIT_LIST_HEAD(&queue->incoming);
> diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
> index 6564bdcdac66..18d3702fa95d 100644
> --- a/include/linux/iio/buffer-dma.h
> +++ b/include/linux/iio/buffer-dma.h
> @@ -19,14 +19,12 @@ struct device;
>  
>  /**
>   * enum iio_block_state - State of a struct iio_dma_buffer_block
> - * @IIO_BLOCK_STATE_DEQUEUED: Block is not queued
>   * @IIO_BLOCK_STATE_QUEUED: Block is on the incoming queue
>   * @IIO_BLOCK_STATE_ACTIVE: Block is currently being processed by the DMA
>   * @IIO_BLOCK_STATE_DONE: Block is on the outgoing queue
>   * @IIO_BLOCK_STATE_DEAD: Block has been marked as to be freed
>   */
>  enum iio_block_state {
> -	IIO_BLOCK_STATE_DEQUEUED,
>  	IIO_BLOCK_STATE_QUEUED,
>  	IIO_BLOCK_STATE_ACTIVE,
>  	IIO_BLOCK_STATE_DONE,
> @@ -73,12 +71,15 @@ struct iio_dma_buffer_block {
>   * @active_block: Block being used in read()
>   * @pos: Read offset in the active block
>   * @block_size: Size of each block
> + * @next_dequeue: index of next block that will be dequeued
>   */
>  struct iio_dma_buffer_queue_fileio {
>  	struct iio_dma_buffer_block *blocks[2];
>  	struct iio_dma_buffer_block *active_block;
>  	size_t pos;
>  	size_t block_size;
> +
> +	unsigned int next_dequeue;
>  };
>  
>  /**
> @@ -93,7 +94,6 @@ struct iio_dma_buffer_queue_fileio {
>   *   list and typically also a list of active blocks in the part that handles
>   *   the DMA controller
>   * @incoming: List of buffers on the incoming queue
> - * @outgoing: List of buffers on the outgoing queue
>   * @active: Whether the buffer is currently active
>   * @fileio: FileIO state
>   */
> @@ -105,7 +105,6 @@ struct iio_dma_buffer_queue {
>  	struct mutex lock;
>  	spinlock_t list_lock;
>  	struct list_head incoming;
> -	struct list_head outgoing;
>  
>  	bool active;
>  


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support
  2022-02-07 12:59 ` [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support Paul Cercueil
@ 2022-03-28 17:24   ` Jonathan Cameron
  2022-03-28 18:39     ` Paul Cercueil
  2022-03-28 20:38   ` Andy Shevchenko
  1 sibling, 1 reply; 41+ messages in thread
From: Jonathan Cameron @ 2022-03-28 17:24 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

On Mon,  7 Feb 2022 12:59:23 +0000
Paul Cercueil <paul@crapouillou.net> wrote:

> Adding write support to the buffer-dma code is easy - the write()
> function basically needs to do the exact same thing as the read()
> function: dequeue a block, read or write the data, enqueue the block
> when entirely processed.
> 
> Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write()
> now both call a function iio_buffer_dma_io(), which will perform this
> task.
> 
> The .space_available() callback can return the exact same value as the
> .data_available() callback for input buffers, since in both cases we
> count the exact same thing (the number of bytes in each available
> block).
> 
> Note that we preemptively reset block->bytes_used to the buffer's size
> in iio_dma_buffer_request_update(), as in the future the
> iio_dma_buffer_enqueue() function won't reset it.
> 
> v2: - Fix block->state not being reset in
>       iio_dma_buffer_request_update() for output buffers.
>     - Only update block->bytes_used once and add a comment about why we
>       update it.
>     - Add a comment about why we're setting a different state for output
>       buffers in iio_dma_buffer_request_update()
>     - Remove useless cast to bool (!!) in iio_dma_buffer_io()
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
One comment inline.

I'd be tempted to queue this up with that fixed, but do we have
any users?  Even though it's trivial, I'm not that keen on merging
code upstream well in advance of it being used.

Thanks,

Jonathan

> ---
>  drivers/iio/buffer/industrialio-buffer-dma.c | 88 ++++++++++++++++----
>  include/linux/iio/buffer-dma.h               |  7 ++
>  2 files changed, 79 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
> index 1fc91467d1aa..a9f1b673374f 100644
> --- a/drivers/iio/buffer/industrialio-buffer-dma.c
> +++ b/drivers/iio/buffer/industrialio-buffer-dma.c
> @@ -195,6 +195,18 @@ static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
>  		block->state = IIO_BLOCK_STATE_DONE;
>  }
>  
> +static void iio_dma_buffer_queue_wake(struct iio_dma_buffer_queue *queue)
> +{
> +	__poll_t flags;
> +
> +	if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN)
> +		flags = EPOLLIN | EPOLLRDNORM;
> +	else
> +		flags = EPOLLOUT | EPOLLWRNORM;
> +
> +	wake_up_interruptible_poll(&queue->buffer.pollq, flags);
> +}
> +
>  /**
>   * iio_dma_buffer_block_done() - Indicate that a block has been completed
>   * @block: The completed block
> @@ -212,7 +224,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
>  	spin_unlock_irqrestore(&queue->list_lock, flags);
>  
>  	iio_buffer_block_put_atomic(block);
> -	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
> +	iio_dma_buffer_queue_wake(queue);
>  }
>  EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
>  
> @@ -241,7 +253,7 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
>  	}
>  	spin_unlock_irqrestore(&queue->list_lock, flags);
>  
> -	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
> +	iio_dma_buffer_queue_wake(queue);
>  }
>  EXPORT_SYMBOL_GPL(iio_dma_buffer_block_list_abort);
>  
> @@ -335,8 +347,24 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
>  			queue->fileio.blocks[i] = block;
>  		}
>  
> -		block->state = IIO_BLOCK_STATE_QUEUED;
> -		list_add_tail(&block->head, &queue->incoming);
> +		/*
> +		 * block->bytes_used may have been modified previously, e.g. by
> +		 * iio_dma_buffer_block_list_abort(). Reset it here to the
> +		 * block's so that iio_dma_buffer_io() will work.
> +		 */
> +		block->bytes_used = block->size;
> +
> +		/*
> +		 * If it's an input buffer, mark the block as queued, and
> +		 * iio_dma_buffer_enable() will submit it. Otherwise mark it as
> +		 * done, which means it's ready to be dequeued.
> +		 */
> +		if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
> +			block->state = IIO_BLOCK_STATE_QUEUED;
> +			list_add_tail(&block->head, &queue->incoming);
> +		} else {
> +			block->state = IIO_BLOCK_STATE_DONE;
> +		}
>  	}
>  
>  out_unlock:
> @@ -465,20 +493,12 @@ static struct iio_dma_buffer_block *iio_dma_buffer_dequeue(
>  	return block;
>  }
>  
> -/**
> - * iio_dma_buffer_read() - DMA buffer read callback
> - * @buffer: Buffer to read form
> - * @n: Number of bytes to read
> - * @user_buffer: Userspace buffer to copy the data to
> - *
> - * Should be used as the read callback for iio_buffer_access_ops
> - * struct for DMA buffers.
> - */
> -int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
> -	char __user *user_buffer)
> +static int iio_dma_buffer_io(struct iio_buffer *buffer,
> +			     size_t n, char __user *user_buffer, bool is_write)
>  {
>  	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
>  	struct iio_dma_buffer_block *block;
> +	void *addr;
>  	int ret;
>  
>  	if (n < buffer->bytes_per_datum)
> @@ -501,8 +521,13 @@ int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
>  	n = rounddown(n, buffer->bytes_per_datum);
>  	if (n > block->bytes_used - queue->fileio.pos)
>  		n = block->bytes_used - queue->fileio.pos;
> +	addr = block->vaddr + queue->fileio.pos;
>  
> -	if (copy_to_user(user_buffer, block->vaddr + queue->fileio.pos, n)) {
> +	if (is_write)
> +		ret = copy_from_user(addr, user_buffer, n);
> +	else
> +		ret = copy_to_user(user_buffer, addr, n);
> +	if (ret) {
>  		ret = -EFAULT;
>  		goto out_unlock;
>  	}
> @@ -521,8 +546,39 @@ int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
>  
>  	return ret;
>  }
> +
> +/**
> + * iio_dma_buffer_read() - DMA buffer read callback
> + * @buffer: Buffer to read from
> + * @n: Number of bytes to read
> + * @user_buffer: Userspace buffer to copy the data to
> + *
> + * Should be used as the read callback for iio_buffer_access_ops
> + * struct for DMA buffers.
> + */
> +int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
> +	char __user *user_buffer)
> +{
> +	return iio_dma_buffer_io(buffer, n, user_buffer, false);
> +}
>  EXPORT_SYMBOL_GPL(iio_dma_buffer_read);
>  
> +/**
> + * iio_dma_buffer_write() - DMA buffer write callback
> + * @buffer: Buffer to write to
> + * @n: Number of bytes to write
> + * @user_buffer: Userspace buffer to copy the data from
> + *
> + * Should be used as the write callback for iio_buffer_access_ops
> + * struct for DMA buffers.
> + */
> +int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
> +			 const char __user *user_buffer)
> +{
> +	return iio_dma_buffer_io(buffer, n, (__force char *)user_buffer, true);

Casting away the const is a little nasty.  Perhaps it's worth giving
iio_dma_buffer_io() separate pointer parameters for the read and write
cases, and hence keep the const in place?
return iio_dma_buffer_io(buffer, n, NULL, user_buffer, true);
and
return iio_dma_buffer_io(buffer, n, user_buffer, NULL, false);

> +}
> +EXPORT_SYMBOL_GPL(iio_dma_buffer_write);
> +
>  /**
>   * iio_dma_buffer_data_available() - DMA buffer data_available callback
>   * @buf: Buffer to check for data availability
> diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
> index 18d3702fa95d..490b93f76fa8 100644
> --- a/include/linux/iio/buffer-dma.h
> +++ b/include/linux/iio/buffer-dma.h
> @@ -132,6 +132,8 @@ int iio_dma_buffer_disable(struct iio_buffer *buffer,
>  	struct iio_dev *indio_dev);
>  int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
>  	char __user *user_buffer);
> +int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
> +			 const char __user *user_buffer);
>  size_t iio_dma_buffer_data_available(struct iio_buffer *buffer);
>  int iio_dma_buffer_set_bytes_per_datum(struct iio_buffer *buffer, size_t bpd);
>  int iio_dma_buffer_set_length(struct iio_buffer *buffer, unsigned int length);
> @@ -142,4 +144,9 @@ int iio_dma_buffer_init(struct iio_dma_buffer_queue *queue,
>  void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue);
>  void iio_dma_buffer_release(struct iio_dma_buffer_queue *queue);
>  
> +static inline size_t iio_dma_buffer_space_available(struct iio_buffer *buffer)
> +{
> +	return iio_dma_buffer_data_available(buffer);
> +}
> +
>  #endif



* Re: [PATCH v2 04/12] iio: buffer-dmaengine: Enable write support
  2022-02-07 12:59 ` [PATCH v2 04/12] iio: buffer-dmaengine: Enable write support Paul Cercueil
@ 2022-03-28 17:28   ` Jonathan Cameron
  0 siblings, 0 replies; 41+ messages in thread
From: Jonathan Cameron @ 2022-03-28 17:28 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

On Mon,  7 Feb 2022 12:59:25 +0000
Paul Cercueil <paul@crapouillou.net> wrote:

> Use the iio_dma_buffer_write() and iio_dma_buffer_space_available()
> functions provided by the buffer-dma core, to enable write support in
> the buffer-dmaengine code.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
This (and previous) look fine to me. Just that question of a user for
the new functionality...

Jonathan

> ---
>  drivers/iio/buffer/industrialio-buffer-dmaengine.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
> index ac26b04aa4a9..5cde8fd81c7f 100644
> --- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
> +++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
> @@ -123,12 +123,14 @@ static void iio_dmaengine_buffer_release(struct iio_buffer *buf)
>  
>  static const struct iio_buffer_access_funcs iio_dmaengine_buffer_ops = {
>  	.read = iio_dma_buffer_read,
> +	.write = iio_dma_buffer_write,
>  	.set_bytes_per_datum = iio_dma_buffer_set_bytes_per_datum,
>  	.set_length = iio_dma_buffer_set_length,
>  	.request_update = iio_dma_buffer_request_update,
>  	.enable = iio_dma_buffer_enable,
>  	.disable = iio_dma_buffer_disable,
>  	.data_available = iio_dma_buffer_data_available,
> +	.space_available = iio_dma_buffer_space_available,
>  	.release = iio_dmaengine_buffer_release,
>  
>  	.modes = INDIO_BUFFER_HARDWARE,



* Re: [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure
  2022-02-07 12:59 ` [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure Paul Cercueil
@ 2022-03-28 17:37   ` Jonathan Cameron
  2022-03-28 18:44     ` Paul Cercueil
  2022-03-28 20:46   ` Andy Shevchenko
  1 sibling, 1 reply; 41+ messages in thread
From: Jonathan Cameron @ 2022-03-28 17:37 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

On Mon,  7 Feb 2022 12:59:26 +0000
Paul Cercueil <paul@crapouillou.net> wrote:

> Add the necessary infrastructure to the IIO core to support a new
> optional DMABUF based interface.
> 
> The advantage of this new DMABUF based interface vs. the read()
> interface, is that it avoids an extra copy of the data between the
> kernel and userspace. This is particularly userful for high-speed

useful

> devices which produce several megabytes or even gigabytes of data per
> second.
> 
> The data in this new DMABUF interface is managed at the granularity of
> DMABUF objects. Reducing the granularity from byte level to block level
> is done to reduce the userspace-kernelspace synchronization overhead
> since performing syscalls for each byte at a few Mbps is just not
> feasible.
> 
> This of course leads to a slightly increased latency. For this reason an
> application can choose the size of the DMABUFs as well as how many it
> allocates. E.g. two DMABUFs would be a traditional double buffering
> scheme. But using a higher number might be necessary to avoid
> underflow/overflow situations in the presence of scheduling latencies.
> 
> As part of the interface, 2 new IOCTLs have been added:
> 
> IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *):
>  Each call will allocate a new DMABUF object. The return value (if not
>  a negative errno value as error) will be the file descriptor of the new
>  DMABUF.
> 
> IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
>  Place the DMABUF object into the queue pending for hardware process.
> 
> These two IOCTLs have to be performed on the IIO buffer's file
> descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.

Just to check, do they work on the old deprecated chardev route? Normally
we can directly access the first buffer without the ioctl.

> 
> To access the data stored in a block by userspace the block must be
> mapped to the process's memory. This is done by calling mmap() on the
> DMABUF's file descriptor.
> 
> Before accessing the data through the map, you must use the
> DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
> DMA_BUF_SYNC_START flag, to make sure that the data is available.
> This call may block until the hardware is done with this block. Once
> you are done reading or writing the data, you must use this ioctl again
> with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
> kernel's queue.
> 
> If you need to know when the hardware is done with a DMABUF, you can
> poll its file descriptor for the EPOLLOUT event.
> 
> Finally, to destroy a DMABUF object, simply call close() on its file
> descriptor.
> 
> A typical workflow for the new interface is:
> 
>   for block in blocks:
>     DMABUF_ALLOC block
>     mmap block
> 
>   enable buffer
> 
>   while !done
>     for block in blocks:
>       DMABUF_ENQUEUE block
> 
>       DMABUF_SYNC_START block
>       process data
>       DMABUF_SYNC_END block
> 
>   disable buffer
> 
>   for block in blocks:
>     close block

Given my very limited knowledge of dma-buf, I'll leave commenting
on the flow to others who know if this looks 'standards' or not ;)

Code looks sane to me..

> 
> v2: Only allow the new IOCTLs on the buffer FD created with
>     IIO_BUFFER_GET_FD_IOCTL().
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> ---
>  drivers/iio/industrialio-buffer.c | 55 +++++++++++++++++++++++++++++++
>  include/linux/iio/buffer_impl.h   |  8 +++++
>  include/uapi/linux/iio/buffer.h   | 29 ++++++++++++++++
>  3 files changed, 92 insertions(+)
> 
> diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
> index 94eb9f6cf128..72f333a519bc 100644
> --- a/drivers/iio/industrialio-buffer.c
> +++ b/drivers/iio/industrialio-buffer.c
> @@ -17,6 +17,7 @@
>  #include <linux/fs.h>
>  #include <linux/cdev.h>
>  #include <linux/slab.h>
> +#include <linux/mm.h>
>  #include <linux/poll.h>
>  #include <linux/sched/signal.h>
>  
> @@ -1520,11 +1521,65 @@ static int iio_buffer_chrdev_release(struct inode *inode, struct file *filep)
>  	return 0;
>  }
>  
> +static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
> +				     struct iio_dmabuf __user *user_buf)
> +{
> +	struct iio_dmabuf dmabuf;
> +
> +	if (!buffer->access->enqueue_dmabuf)
> +		return -EPERM;
> +
> +	if (copy_from_user(&dmabuf, user_buf, sizeof(dmabuf)))
> +		return -EFAULT;
> +
> +	if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
> +		return -EINVAL;
> +
> +	return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
> +}
> +
> +static int iio_buffer_alloc_dmabuf(struct iio_buffer *buffer,
> +				   struct iio_dmabuf_alloc_req __user *user_req)
> +{
> +	struct iio_dmabuf_alloc_req req;
> +
> +	if (!buffer->access->alloc_dmabuf)
> +		return -EPERM;
> +
> +	if (copy_from_user(&req, user_req, sizeof(req)))
> +		return -EFAULT;
> +
> +	if (req.resv)
> +		return -EINVAL;
> +
> +	return buffer->access->alloc_dmabuf(buffer, &req);
> +}
> +
> +static long iio_buffer_chrdev_ioctl(struct file *filp,
> +				    unsigned int cmd, unsigned long arg)
> +{
> +	struct iio_dev_buffer_pair *ib = filp->private_data;
> +	struct iio_buffer *buffer = ib->buffer;
> +	void __user *_arg = (void __user *)arg;
> +
> +	switch (cmd) {
> +	case IIO_BUFFER_DMABUF_ALLOC_IOCTL:
> +		return iio_buffer_alloc_dmabuf(buffer, _arg);
> +	case IIO_BUFFER_DMABUF_ENQUEUE_IOCTL:
> +		/* TODO: support non-blocking enqueue operation */
> +		return iio_buffer_enqueue_dmabuf(buffer, _arg);
> +	default:
> +		return IIO_IOCTL_UNHANDLED;
> +	}
> +}
> +
>  static const struct file_operations iio_buffer_chrdev_fileops = {
>  	.owner = THIS_MODULE,
>  	.llseek = noop_llseek,
>  	.read = iio_buffer_read,
>  	.write = iio_buffer_write,
> +	.unlocked_ioctl = iio_buffer_chrdev_ioctl,
> +	.compat_ioctl = compat_ptr_ioctl,
>  	.poll = iio_buffer_poll,
>  	.release = iio_buffer_chrdev_release,
>  };
> diff --git a/include/linux/iio/buffer_impl.h b/include/linux/iio/buffer_impl.h
> index e2ca8ea23e19..728541bc2c63 100644
> --- a/include/linux/iio/buffer_impl.h
> +++ b/include/linux/iio/buffer_impl.h
> @@ -39,6 +39,9 @@ struct iio_buffer;
>   *                      device stops sampling. Calles are balanced with @enable.
>   * @release:		called when the last reference to the buffer is dropped,
>   *			should free all resources allocated by the buffer.
> + * @alloc_dmabuf:	called from userspace via ioctl to allocate one DMABUF.
> + * @enqueue_dmabuf:	called from userspace via ioctl to queue this DMABUF
> + *			object to this buffer. Requires a valid DMABUF fd.
>   * @modes:		Supported operating modes by this buffer type
>   * @flags:		A bitmask combination of INDIO_BUFFER_FLAG_*
>   *
> @@ -68,6 +71,11 @@ struct iio_buffer_access_funcs {
>  
>  	void (*release)(struct iio_buffer *buffer);
>  
> +	int (*alloc_dmabuf)(struct iio_buffer *buffer,
> +			    struct iio_dmabuf_alloc_req *req);
> +	int (*enqueue_dmabuf)(struct iio_buffer *buffer,
> +			      struct iio_dmabuf *block);
> +
>  	unsigned int modes;
>  	unsigned int flags;
>  };
> diff --git a/include/uapi/linux/iio/buffer.h b/include/uapi/linux/iio/buffer.h
> index 13939032b3f6..e4621b926262 100644
> --- a/include/uapi/linux/iio/buffer.h
> +++ b/include/uapi/linux/iio/buffer.h
> @@ -5,6 +5,35 @@
>  #ifndef _UAPI_IIO_BUFFER_H_
>  #define _UAPI_IIO_BUFFER_H_
>  
> +#include <linux/types.h>
> +
> +#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000000
> +
> +/**
> + * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO DMABUFs
> + * @size:	the size of a single DMABUF
> + * @resv:	reserved
> + */
> +struct iio_dmabuf_alloc_req {
> +	__u64 size;
> +	__u64 resv;
> +};
> +
> +/**
> + * struct iio_dmabuf - Descriptor for a single IIO DMABUF object
> + * @fd:		file descriptor of the DMABUF object
> + * @flags:	one or more IIO_BUFFER_DMABUF_* flags
> + * @bytes_used:	number of bytes used in this DMABUF for the data transfer.
> + *		If zero, the full buffer is used.
> + */
> +struct iio_dmabuf {
> +	__u32 fd;
> +	__u32 flags;
> +	__u64 bytes_used;
> +};
> +
>  #define IIO_BUFFER_GET_FD_IOCTL			_IOWR('i', 0x91, int)
> +#define IIO_BUFFER_DMABUF_ALLOC_IOCTL		_IOW('i', 0x92, struct iio_dmabuf_alloc_req)
> +#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL		_IOW('i', 0x93, struct iio_dmabuf)
>  
>  #endif /* _UAPI_IIO_BUFFER_H_ */



* Re: [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution
  2022-02-07 12:59 ` [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution Paul Cercueil
@ 2022-03-28 17:54   ` Jonathan Cameron
  2022-03-28 17:54     ` Christian König
  2022-03-28 19:16     ` Paul Cercueil
  0 siblings, 2 replies; 41+ messages in thread
From: Jonathan Cameron @ 2022-03-28 17:54 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

On Mon,  7 Feb 2022 12:59:28 +0000
Paul Cercueil <paul@crapouillou.net> wrote:

> Enhance the current fileio code by using DMABUF objects instead of
> custom buffers.
> 
> This adds more code than it removes, but:
> - a lot of the complexity can be dropped, e.g. custom kref and
>   iio_buffer_block_put_atomic() are not needed anymore;
> - it will be much easier to introduce an API to export these DMABUF
>   objects to userspace in a following patch.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Hi Paul,

I'm a bit rusty on dma mappings, but you seem to have
a mixture of streaming and coherent mappings going on in here.

Is it the case that the current code is using the coherent mappings
and a potential 'other user' of the dma buffer might need
streaming mappings?

Jonathan

> ---
>  drivers/iio/buffer/industrialio-buffer-dma.c | 192 ++++++++++++-------
>  include/linux/iio/buffer-dma.h               |   8 +-
>  2 files changed, 122 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
> index 15ea7bc3ac08..54e6000cd2ee 100644
> --- a/drivers/iio/buffer/industrialio-buffer-dma.c
> +++ b/drivers/iio/buffer/industrialio-buffer-dma.c
> @@ -14,6 +14,7 @@
>  #include <linux/poll.h>
>  #include <linux/iio/buffer_impl.h>
>  #include <linux/iio/buffer-dma.h>
> +#include <linux/dma-buf.h>
>  #include <linux/dma-mapping.h>
>  #include <linux/sizes.h>
>  
> @@ -90,103 +91,145 @@
>   * callback is called from within the custom callback.
>   */
>  
> -static void iio_buffer_block_release(struct kref *kref)
> -{
> -	struct iio_dma_buffer_block *block = container_of(kref,
> -		struct iio_dma_buffer_block, kref);
> -
> -	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
> -
> -	dma_free_coherent(block->queue->dev, PAGE_ALIGN(block->size),
> -					block->vaddr, block->phys_addr);
> -
> -	iio_buffer_put(&block->queue->buffer);
> -	kfree(block);
> -}
> -
> -static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
> -{
> -	kref_get(&block->kref);
> -}
> -
> -static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
> -{
> -	kref_put(&block->kref, iio_buffer_block_release);
> -}
> -
> -/*
> - * dma_free_coherent can sleep, hence we need to take some special care to be
> - * able to drop a reference from an atomic context.
> - */
> -static LIST_HEAD(iio_dma_buffer_dead_blocks);
> -static DEFINE_SPINLOCK(iio_dma_buffer_dead_blocks_lock);
> -
> -static void iio_dma_buffer_cleanup_worker(struct work_struct *work)
> -{
> -	struct iio_dma_buffer_block *block, *_block;
> -	LIST_HEAD(block_list);
> -
> -	spin_lock_irq(&iio_dma_buffer_dead_blocks_lock);
> -	list_splice_tail_init(&iio_dma_buffer_dead_blocks, &block_list);
> -	spin_unlock_irq(&iio_dma_buffer_dead_blocks_lock);
> -
> -	list_for_each_entry_safe(block, _block, &block_list, head)
> -		iio_buffer_block_release(&block->kref);
> -}
> -static DECLARE_WORK(iio_dma_buffer_cleanup_work, iio_dma_buffer_cleanup_worker);
> -
> -static void iio_buffer_block_release_atomic(struct kref *kref)
> -{
> +struct iio_buffer_dma_buf_attachment {
> +	struct scatterlist sgl;
> +	struct sg_table sg_table;
>  	struct iio_dma_buffer_block *block;
> -	unsigned long flags;
> -
> -	block = container_of(kref, struct iio_dma_buffer_block, kref);
> -
> -	spin_lock_irqsave(&iio_dma_buffer_dead_blocks_lock, flags);
> -	list_add_tail(&block->head, &iio_dma_buffer_dead_blocks);
> -	spin_unlock_irqrestore(&iio_dma_buffer_dead_blocks_lock, flags);
> -
> -	schedule_work(&iio_dma_buffer_cleanup_work);
> -}
> -
> -/*
> - * Version of iio_buffer_block_put() that can be called from atomic context
> - */
> -static void iio_buffer_block_put_atomic(struct iio_dma_buffer_block *block)
> -{
> -	kref_put(&block->kref, iio_buffer_block_release_atomic);
> -}
> +};
>  
>  static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
>  {
>  	return container_of(buf, struct iio_dma_buffer_queue, buffer);
>  }
>  
> +static struct iio_buffer_dma_buf_attachment *
> +to_iio_buffer_dma_buf_attachment(struct sg_table *table)
> +{
> +	return container_of(table, struct iio_buffer_dma_buf_attachment, sg_table);
> +}
> +
> +static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
> +{
> +	get_dma_buf(block->dmabuf);
> +}
> +
> +static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
> +{
> +	dma_buf_put(block->dmabuf);
> +}
> +
> +static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
> +				     struct dma_buf_attachment *at)
> +{
> +	at->priv = dbuf->priv;
> +
> +	return 0;
> +}
> +
> +static struct sg_table *iio_buffer_dma_buf_map(struct dma_buf_attachment *at,
> +					       enum dma_data_direction dma_dir)
> +{
> +	struct iio_dma_buffer_block *block = at->priv;
> +	struct iio_buffer_dma_buf_attachment *dba;
> +	int ret;
> +
> +	dba = kzalloc(sizeof(*dba), GFP_KERNEL);
> +	if (!dba)
> +		return ERR_PTR(-ENOMEM);
> +
> +	sg_init_one(&dba->sgl, block->vaddr, PAGE_ALIGN(block->size));
> +	dba->sg_table.sgl = &dba->sgl;
> +	dba->sg_table.nents = 1;
> +	dba->block = block;
> +
> +	ret = dma_map_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
> +	if (ret) {
> +		kfree(dba);
> +		return ERR_PTR(ret);
> +	}
> +
> +	return &dba->sg_table;
> +}
> +
> +static void iio_buffer_dma_buf_unmap(struct dma_buf_attachment *at,
> +				     struct sg_table *sg_table,
> +				     enum dma_data_direction dma_dir)
> +{
> +	struct iio_buffer_dma_buf_attachment *dba =
> +		to_iio_buffer_dma_buf_attachment(sg_table);
> +
> +	dma_unmap_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
> +	kfree(dba);
> +}
> +
> +static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
> +{
> +	struct iio_dma_buffer_block *block = dbuf->priv;
> +	struct iio_dma_buffer_queue *queue = block->queue;
> +
> +	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
> +
> +	mutex_lock(&queue->lock);
> +
> +	dma_free_coherent(queue->dev, PAGE_ALIGN(block->size),
> +			  block->vaddr, block->phys_addr);
> +	kfree(block);
> +
> +	mutex_unlock(&queue->lock);
> +	iio_buffer_put(&queue->buffer);
> +}
> +
> +static const struct dma_buf_ops iio_dma_buffer_dmabuf_ops = {
> +	.attach			= iio_buffer_dma_buf_attach,
> +	.map_dma_buf		= iio_buffer_dma_buf_map,
> +	.unmap_dma_buf		= iio_buffer_dma_buf_unmap,
> +	.release		= iio_buffer_dma_buf_release,
> +};
> +
>  static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
>  	struct iio_dma_buffer_queue *queue, size_t size)
>  {
>  	struct iio_dma_buffer_block *block;
> +	DEFINE_DMA_BUF_EXPORT_INFO(einfo);
> +	struct dma_buf *dmabuf;
> +	int err = -ENOMEM;
>  
>  	block = kzalloc(sizeof(*block), GFP_KERNEL);
>  	if (!block)
> -		return NULL;
> +		return ERR_PTR(err);
>  
>  	block->vaddr = dma_alloc_coherent(queue->dev, PAGE_ALIGN(size),
>  		&block->phys_addr, GFP_KERNEL);
> -	if (!block->vaddr) {
> -		kfree(block);
> -		return NULL;
> +	if (!block->vaddr)
> +		goto err_free_block;
> +
> +	einfo.ops = &iio_dma_buffer_dmabuf_ops;
> +	einfo.size = PAGE_ALIGN(size);
> +	einfo.priv = block;
> +	einfo.flags = O_RDWR;
> +
> +	dmabuf = dma_buf_export(&einfo);
> +	if (IS_ERR(dmabuf)) {
> +		err = PTR_ERR(dmabuf);
> +		goto err_free_dma;
>  	}
>  
> +	block->dmabuf = dmabuf;
>  	block->size = size;
>  	block->state = IIO_BLOCK_STATE_DONE;
>  	block->queue = queue;
>  	INIT_LIST_HEAD(&block->head);
> -	kref_init(&block->kref);
>  
>  	iio_buffer_get(&queue->buffer);
>  
>  	return block;
> +
> +err_free_dma:
> +	dma_free_coherent(queue->dev, PAGE_ALIGN(size),
> +			  block->vaddr, block->phys_addr);
> +err_free_block:
> +	kfree(block);
> +	return ERR_PTR(err);
>  }
>  
>  static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
> @@ -223,7 +266,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
>  	_iio_dma_buffer_block_done(block);
>  	spin_unlock_irqrestore(&queue->list_lock, flags);
>  
> -	iio_buffer_block_put_atomic(block);
> +	iio_buffer_block_put(block);
>  	iio_dma_buffer_queue_wake(queue);
>  }
>  EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
> @@ -249,7 +292,8 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
>  		list_del(&block->head);
>  		block->bytes_used = 0;
>  		_iio_dma_buffer_block_done(block);
> -		iio_buffer_block_put_atomic(block);
> +
> +		iio_buffer_block_put(block);
>  	}
>  	spin_unlock_irqrestore(&queue->list_lock, flags);
>  
> @@ -340,8 +384,8 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
>  
>  		if (!block) {
>  			block = iio_dma_buffer_alloc_block(queue, size);
> -			if (!block) {
> -				ret = -ENOMEM;
> +			if (IS_ERR(block)) {
> +				ret = PTR_ERR(block);
>  				goto out_unlock;
>  			}
>  			queue->fileio.blocks[i] = block;
> diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
> index 490b93f76fa8..6b3fa7d2124b 100644
> --- a/include/linux/iio/buffer-dma.h
> +++ b/include/linux/iio/buffer-dma.h
> @@ -8,7 +8,6 @@
>  #define __INDUSTRIALIO_DMA_BUFFER_H__
>  
>  #include <linux/list.h>
> -#include <linux/kref.h>
>  #include <linux/spinlock.h>
>  #include <linux/mutex.h>
>  #include <linux/iio/buffer_impl.h>
> @@ -16,6 +15,7 @@
>  struct iio_dma_buffer_queue;
>  struct iio_dma_buffer_ops;
>  struct device;
> +struct dma_buf;
>  
>  /**
>   * enum iio_block_state - State of a struct iio_dma_buffer_block
> @@ -39,8 +39,8 @@ enum iio_block_state {
>   * @vaddr: Virtual address of the blocks memory
>   * @phys_addr: Physical address of the blocks memory
>   * @queue: Parent DMA buffer queue
> - * @kref: kref used to manage the lifetime of block
>   * @state: Current state of the block
> + * @dmabuf: Underlying DMABUF object
>   */
>  struct iio_dma_buffer_block {
>  	/* May only be accessed by the owner of the block */
> @@ -56,13 +56,13 @@ struct iio_dma_buffer_block {
>  	size_t size;
>  	struct iio_dma_buffer_queue *queue;
>  
> -	/* Must not be accessed outside the core. */
> -	struct kref kref;
>  	/*
>  	 * Must not be accessed outside the core. Access needs to hold
>  	 * queue->list_lock if the block is not owned by the core.
>  	 */
>  	enum iio_block_state state;
> +
> +	struct dma_buf *dmabuf;
>  };
>  
>  /**


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution
  2022-03-28 17:54   ` Jonathan Cameron
@ 2022-03-28 17:54     ` Christian König
  2022-03-28 19:16     ` Paul Cercueil
  1 sibling, 0 replies; 41+ messages in thread
From: Christian König @ 2022-03-28 17:54 UTC (permalink / raw)
  To: Jonathan Cameron, Paul Cercueil
  Cc: Michael Hennerich, Lars-Peter Clausen, Sumit Semwal,
	Jonathan Corbet, Alexandru Ardelean, dri-devel, linaro-mm-sig,
	linux-doc, linux-kernel, linux-iio

On 28.03.22 at 19:54, Jonathan Cameron wrote:
> On Mon,  7 Feb 2022 12:59:28 +0000
> Paul Cercueil <paul@crapouillou.net> wrote:
>
>> Enhance the current fileio code by using DMABUF objects instead of
>> custom buffers.
>>
>> This adds more code than it removes, but:
>> - a lot of the complexity can be dropped, e.g. custom kref and
>>    iio_buffer_block_put_atomic() are not needed anymore;
>> - it will be much easier to introduce an API to export these DMABUF
>>    objects to userspace in a following patch.
>>
>> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> Hi Paul,
>
> I'm a bit rusty on dma mappings, but you seem to have
> a mixture of streaming and coherent mappings going on in here.
>
> Is it the case that the current code is using the coherent mappings
> and a potential 'other user' of the dma buffer might need
> streaming mappings?

Streaming mappings are generally not supported by DMA-buf.

You always have only coherent mappings.

Regards,
Christian.

>
> Jonathan
>
>> ---
>>   drivers/iio/buffer/industrialio-buffer-dma.c | 192 ++++++++++++-------
>>   include/linux/iio/buffer-dma.h               |   8 +-
>>   2 files changed, 122 insertions(+), 78 deletions(-)
>>
>> diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
>> index 15ea7bc3ac08..54e6000cd2ee 100644
>> --- a/drivers/iio/buffer/industrialio-buffer-dma.c
>> +++ b/drivers/iio/buffer/industrialio-buffer-dma.c
>> @@ -14,6 +14,7 @@
>>   #include <linux/poll.h>
>>   #include <linux/iio/buffer_impl.h>
>>   #include <linux/iio/buffer-dma.h>
>> +#include <linux/dma-buf.h>
>>   #include <linux/dma-mapping.h>
>>   #include <linux/sizes.h>
>>   
>> @@ -90,103 +91,145 @@
>>    * callback is called from within the custom callback.
>>    */
>>   
>> -static void iio_buffer_block_release(struct kref *kref)
>> -{
>> -	struct iio_dma_buffer_block *block = container_of(kref,
>> -		struct iio_dma_buffer_block, kref);
>> -
>> -	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
>> -
>> -	dma_free_coherent(block->queue->dev, PAGE_ALIGN(block->size),
>> -					block->vaddr, block->phys_addr);
>> -
>> -	iio_buffer_put(&block->queue->buffer);
>> -	kfree(block);
>> -}
>> -
>> -static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
>> -{
>> -	kref_get(&block->kref);
>> -}
>> -
>> -static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
>> -{
>> -	kref_put(&block->kref, iio_buffer_block_release);
>> -}
>> -
>> -/*
>> - * dma_free_coherent can sleep, hence we need to take some special care to be
>> - * able to drop a reference from an atomic context.
>> - */
>> -static LIST_HEAD(iio_dma_buffer_dead_blocks);
>> -static DEFINE_SPINLOCK(iio_dma_buffer_dead_blocks_lock);
>> -
>> -static void iio_dma_buffer_cleanup_worker(struct work_struct *work)
>> -{
>> -	struct iio_dma_buffer_block *block, *_block;
>> -	LIST_HEAD(block_list);
>> -
>> -	spin_lock_irq(&iio_dma_buffer_dead_blocks_lock);
>> -	list_splice_tail_init(&iio_dma_buffer_dead_blocks, &block_list);
>> -	spin_unlock_irq(&iio_dma_buffer_dead_blocks_lock);
>> -
>> -	list_for_each_entry_safe(block, _block, &block_list, head)
>> -		iio_buffer_block_release(&block->kref);
>> -}
>> -static DECLARE_WORK(iio_dma_buffer_cleanup_work, iio_dma_buffer_cleanup_worker);
>> -
>> -static void iio_buffer_block_release_atomic(struct kref *kref)
>> -{
>> +struct iio_buffer_dma_buf_attachment {
>> +	struct scatterlist sgl;
>> +	struct sg_table sg_table;
>>   	struct iio_dma_buffer_block *block;
>> -	unsigned long flags;
>> -
>> -	block = container_of(kref, struct iio_dma_buffer_block, kref);
>> -
>> -	spin_lock_irqsave(&iio_dma_buffer_dead_blocks_lock, flags);
>> -	list_add_tail(&block->head, &iio_dma_buffer_dead_blocks);
>> -	spin_unlock_irqrestore(&iio_dma_buffer_dead_blocks_lock, flags);
>> -
>> -	schedule_work(&iio_dma_buffer_cleanup_work);
>> -}
>> -
>> -/*
>> - * Version of iio_buffer_block_put() that can be called from atomic context
>> - */
>> -static void iio_buffer_block_put_atomic(struct iio_dma_buffer_block *block)
>> -{
>> -	kref_put(&block->kref, iio_buffer_block_release_atomic);
>> -}
>> +};
>>   
>>   static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
>>   {
>>   	return container_of(buf, struct iio_dma_buffer_queue, buffer);
>>   }
>>   
>> +static struct iio_buffer_dma_buf_attachment *
>> +to_iio_buffer_dma_buf_attachment(struct sg_table *table)
>> +{
>> +	return container_of(table, struct iio_buffer_dma_buf_attachment, sg_table);
>> +}
>> +
>> +static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
>> +{
>> +	get_dma_buf(block->dmabuf);
>> +}
>> +
>> +static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
>> +{
>> +	dma_buf_put(block->dmabuf);
>> +}
>> +
>> +static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
>> +				     struct dma_buf_attachment *at)
>> +{
>> +	at->priv = dbuf->priv;
>> +
>> +	return 0;
>> +}
>> +
>> +static struct sg_table *iio_buffer_dma_buf_map(struct dma_buf_attachment *at,
>> +					       enum dma_data_direction dma_dir)
>> +{
>> +	struct iio_dma_buffer_block *block = at->priv;
>> +	struct iio_buffer_dma_buf_attachment *dba;
>> +	int ret;
>> +
>> +	dba = kzalloc(sizeof(*dba), GFP_KERNEL);
>> +	if (!dba)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	sg_init_one(&dba->sgl, block->vaddr, PAGE_ALIGN(block->size));
>> +	dba->sg_table.sgl = &dba->sgl;
>> +	dba->sg_table.nents = 1;
>> +	dba->block = block;
>> +
>> +	ret = dma_map_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
>> +	if (ret) {
>> +		kfree(dba);
>> +		return ERR_PTR(ret);
>> +	}
>> +
>> +	return &dba->sg_table;
>> +}
>> +
>> +static void iio_buffer_dma_buf_unmap(struct dma_buf_attachment *at,
>> +				     struct sg_table *sg_table,
>> +				     enum dma_data_direction dma_dir)
>> +{
>> +	struct iio_buffer_dma_buf_attachment *dba =
>> +		to_iio_buffer_dma_buf_attachment(sg_table);
>> +
>> +	dma_unmap_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
>> +	kfree(dba);
>> +}
>> +
>> +static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
>> +{
>> +	struct iio_dma_buffer_block *block = dbuf->priv;
>> +	struct iio_dma_buffer_queue *queue = block->queue;
>> +
>> +	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
>> +
>> +	mutex_lock(&queue->lock);
>> +
>> +	dma_free_coherent(queue->dev, PAGE_ALIGN(block->size),
>> +			  block->vaddr, block->phys_addr);
>> +	kfree(block);
>> +
>> +	mutex_unlock(&queue->lock);
>> +	iio_buffer_put(&queue->buffer);
>> +}
>> +
>> +static const struct dma_buf_ops iio_dma_buffer_dmabuf_ops = {
>> +	.attach			= iio_buffer_dma_buf_attach,
>> +	.map_dma_buf		= iio_buffer_dma_buf_map,
>> +	.unmap_dma_buf		= iio_buffer_dma_buf_unmap,
>> +	.release		= iio_buffer_dma_buf_release,
>> +};
>> +
>>   static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
>>   	struct iio_dma_buffer_queue *queue, size_t size)
>>   {
>>   	struct iio_dma_buffer_block *block;
>> +	DEFINE_DMA_BUF_EXPORT_INFO(einfo);
>> +	struct dma_buf *dmabuf;
>> +	int err = -ENOMEM;
>>   
>>   	block = kzalloc(sizeof(*block), GFP_KERNEL);
>>   	if (!block)
>> -		return NULL;
>> +		return ERR_PTR(err);
>>   
>>   	block->vaddr = dma_alloc_coherent(queue->dev, PAGE_ALIGN(size),
>>   		&block->phys_addr, GFP_KERNEL);
>> -	if (!block->vaddr) {
>> -		kfree(block);
>> -		return NULL;
>> +	if (!block->vaddr)
>> +		goto err_free_block;
>> +
>> +	einfo.ops = &iio_dma_buffer_dmabuf_ops;
>> +	einfo.size = PAGE_ALIGN(size);
>> +	einfo.priv = block;
>> +	einfo.flags = O_RDWR;
>> +
>> +	dmabuf = dma_buf_export(&einfo);
>> +	if (IS_ERR(dmabuf)) {
>> +		err = PTR_ERR(dmabuf);
>> +		goto err_free_dma;
>>   	}
>>   
>> +	block->dmabuf = dmabuf;
>>   	block->size = size;
>>   	block->state = IIO_BLOCK_STATE_DONE;
>>   	block->queue = queue;
>>   	INIT_LIST_HEAD(&block->head);
>> -	kref_init(&block->kref);
>>   
>>   	iio_buffer_get(&queue->buffer);
>>   
>>   	return block;
>> +
>> +err_free_dma:
>> +	dma_free_coherent(queue->dev, PAGE_ALIGN(size),
>> +			  block->vaddr, block->phys_addr);
>> +err_free_block:
>> +	kfree(block);
>> +	return ERR_PTR(err);
>>   }
>>   
>>   static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
>> @@ -223,7 +266,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
>>   	_iio_dma_buffer_block_done(block);
>>   	spin_unlock_irqrestore(&queue->list_lock, flags);
>>   
>> -	iio_buffer_block_put_atomic(block);
>> +	iio_buffer_block_put(block);
>>   	iio_dma_buffer_queue_wake(queue);
>>   }
>>   EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
>> @@ -249,7 +292,8 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
>>   		list_del(&block->head);
>>   		block->bytes_used = 0;
>>   		_iio_dma_buffer_block_done(block);
>> -		iio_buffer_block_put_atomic(block);
>> +
>> +		iio_buffer_block_put(block);
>>   	}
>>   	spin_unlock_irqrestore(&queue->list_lock, flags);
>>   
>> @@ -340,8 +384,8 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
>>   
>>   		if (!block) {
>>   			block = iio_dma_buffer_alloc_block(queue, size);
>> -			if (!block) {
>> -				ret = -ENOMEM;
>> +			if (IS_ERR(block)) {
>> +				ret = PTR_ERR(block);
>>   				goto out_unlock;
>>   			}
>>   			queue->fileio.blocks[i] = block;
>> diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
>> index 490b93f76fa8..6b3fa7d2124b 100644
>> --- a/include/linux/iio/buffer-dma.h
>> +++ b/include/linux/iio/buffer-dma.h
>> @@ -8,7 +8,6 @@
>>   #define __INDUSTRIALIO_DMA_BUFFER_H__
>>   
>>   #include <linux/list.h>
>> -#include <linux/kref.h>
>>   #include <linux/spinlock.h>
>>   #include <linux/mutex.h>
>>   #include <linux/iio/buffer_impl.h>
>> @@ -16,6 +15,7 @@
>>   struct iio_dma_buffer_queue;
>>   struct iio_dma_buffer_ops;
>>   struct device;
>> +struct dma_buf;
>>   
>>   /**
>>    * enum iio_block_state - State of a struct iio_dma_buffer_block
>> @@ -39,8 +39,8 @@ enum iio_block_state {
>>   * @vaddr: Virtual address of the blocks memory
>>    * @phys_addr: Physical address of the blocks memory
>>    * @queue: Parent DMA buffer queue
>> - * @kref: kref used to manage the lifetime of block
>>    * @state: Current state of the block
>> + * @dmabuf: Underlying DMABUF object
>>    */
>>   struct iio_dma_buffer_block {
>>   	/* May only be accessed by the owner of the block */
>> @@ -56,13 +56,13 @@ struct iio_dma_buffer_block {
>>   	size_t size;
>>   	struct iio_dma_buffer_queue *queue;
>>   
>> -	/* Must not be accessed outside the core. */
>> -	struct kref kref;
>>   	/*
>>   	 * Must not be accessed outside the core. Access needs to hold
>>   	 * queue->list_lock if the block is not owned by the core.
>>   	 */
>>   	enum iio_block_state state;
>> +
>> +	struct dma_buf *dmabuf;
>>   };
>>   
>>   /**


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support
  2022-03-28 17:24   ` Jonathan Cameron
@ 2022-03-28 18:39     ` Paul Cercueil
  2022-03-29  7:11       ` Nuno Sá
  0 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-03-28 18:39 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

Hi Jonathan,

On Mon, Mar 28 2022 at 18:24:09 +0100, Jonathan Cameron 
<jic23@kernel.org> wrote:
> On Mon,  7 Feb 2022 12:59:23 +0000
> Paul Cercueil <paul@crapouillou.net> wrote:
> 
>>  Adding write support to the buffer-dma code is easy - the write()
>>  function basically needs to do the exact same thing as the read()
>>  function: dequeue a block, read or write the data, enqueue the block
>>  when entirely processed.
>> 
>>  Therefore, the iio_buffer_dma_read() and the new 
>> iio_buffer_dma_write()
>>  now both call a function iio_buffer_dma_io(), which will perform 
>> this
>>  task.
>> 
>>  The .space_available() callback can return the exact same value as 
>> the
>>  .data_available() callback for input buffers, since in both cases we
>>  count the exact same thing (the number of bytes in each available
>>  block).
>> 
>>  Note that we preemptively reset block->bytes_used to the buffer's 
>> size
>>  in iio_dma_buffer_request_update(), as in the future the
>>  iio_dma_buffer_enqueue() function won't reset it.
>> 
>>  v2: - Fix block->state not being reset in
>>        iio_dma_buffer_request_update() for output buffers.
>>      - Only update block->bytes_used once and add a comment about 
>> why we
>>        update it.
>>      - Add a comment about why we're setting a different state for 
>> output
>>        buffers in iio_dma_buffer_request_update()
>>      - Remove useless cast to bool (!!) in iio_dma_buffer_io()
>> 
>>  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
>>  Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
> One comment inline.
> 
> I'd be tempted to queue this up with that fixed, but do we have
> any users?  Even though it's trivial I'm not that keen on code
> upstream well in advance of it being used.

There's a userspace user in libiio. On the kernel side we do have 
drivers that use it in ADI's downstream kernel, which we plan to 
upstream in the long term (but it can take some time, as we need to 
upstream other things first, like JESD204B support).
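For illustration, a minimal userspace writer for the new fileio write() path might look like the sketch below. This is a hedged sketch, not libiio's actual code: the buffer fd is assumed to have been obtained via IIO_BUFFER_GET_FD_IOCTL, and the sample layout is a placeholder.

```c
/* Hedged sketch of a userspace writer for an IIO output (DAC) buffer
 * using the fileio write() path added by this patch. The buffer fd and
 * sample layout are placeholders, not libiio's implementation. */
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Wait until the buffer can accept data, then push one chunk of samples.
 * iio_dma_buffer_queue_wake() signals EPOLLOUT | EPOLLWRNORM for output
 * buffers, so poll() for POLLOUT before writing.
 * Returns the number of bytes written, or -1 on error. */
static ssize_t push_samples(int buf_fd, const void *samples, size_t len)
{
	struct pollfd pfd = { .fd = buf_fd, .events = POLLOUT };

	if (poll(&pfd, 1, -1) < 0)
		return -1;

	return write(buf_fd, samples, len);
}
```

Note that, per the quoted hunk in iio_dma_buffer_request_update(), output blocks start in IIO_BLOCK_STATE_DONE, so a freshly enabled output buffer is immediately writable.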

> 
>>  ---
>>   drivers/iio/buffer/industrialio-buffer-dma.c | 88 
>> ++++++++++++++++----
>>   include/linux/iio/buffer-dma.h               |  7 ++
>>   2 files changed, 79 insertions(+), 16 deletions(-)
>> 
>>  diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c 
>> b/drivers/iio/buffer/industrialio-buffer-dma.c
>>  index 1fc91467d1aa..a9f1b673374f 100644
>>  --- a/drivers/iio/buffer/industrialio-buffer-dma.c
>>  +++ b/drivers/iio/buffer/industrialio-buffer-dma.c
>>  @@ -195,6 +195,18 @@ static void _iio_dma_buffer_block_done(struct 
>> iio_dma_buffer_block *block)
>>   		block->state = IIO_BLOCK_STATE_DONE;
>>   }
>> 
>>  +static void iio_dma_buffer_queue_wake(struct iio_dma_buffer_queue 
>> *queue)
>>  +{
>>  +	__poll_t flags;
>>  +
>>  +	if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN)
>>  +		flags = EPOLLIN | EPOLLRDNORM;
>>  +	else
>>  +		flags = EPOLLOUT | EPOLLWRNORM;
>>  +
>>  +	wake_up_interruptible_poll(&queue->buffer.pollq, flags);
>>  +}
>>  +
>>   /**
>>    * iio_dma_buffer_block_done() - Indicate that a block has been 
>> completed
>>    * @block: The completed block
>>  @@ -212,7 +224,7 @@ void iio_dma_buffer_block_done(struct 
>> iio_dma_buffer_block *block)
>>   	spin_unlock_irqrestore(&queue->list_lock, flags);
>> 
>>   	iio_buffer_block_put_atomic(block);
>>  -	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | 
>> EPOLLRDNORM);
>>  +	iio_dma_buffer_queue_wake(queue);
>>   }
>>   EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
>> 
>>  @@ -241,7 +253,7 @@ void iio_dma_buffer_block_list_abort(struct 
>> iio_dma_buffer_queue *queue,
>>   	}
>>   	spin_unlock_irqrestore(&queue->list_lock, flags);
>> 
>>  -	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | 
>> EPOLLRDNORM);
>>  +	iio_dma_buffer_queue_wake(queue);
>>   }
>>   EXPORT_SYMBOL_GPL(iio_dma_buffer_block_list_abort);
>> 
>>  @@ -335,8 +347,24 @@ int iio_dma_buffer_request_update(struct 
>> iio_buffer *buffer)
>>   			queue->fileio.blocks[i] = block;
>>   		}
>> 
>>  -		block->state = IIO_BLOCK_STATE_QUEUED;
>>  -		list_add_tail(&block->head, &queue->incoming);
>>  +		/*
>>  +		 * block->bytes_used may have been modified previously, e.g. by
>>  +		 * iio_dma_buffer_block_list_abort(). Reset it here to the
>>  +		 * block's size so that iio_dma_buffer_io() will work.
>>  +		 */
>>  +		block->bytes_used = block->size;
>>  +
>>  +		/*
>>  +		 * If it's an input buffer, mark the block as queued, and
>>  +		 * iio_dma_buffer_enable() will submit it. Otherwise mark it as
>>  +		 * done, which means it's ready to be dequeued.
>>  +		 */
>>  +		if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
>>  +			block->state = IIO_BLOCK_STATE_QUEUED;
>>  +			list_add_tail(&block->head, &queue->incoming);
>>  +		} else {
>>  +			block->state = IIO_BLOCK_STATE_DONE;
>>  +		}
>>   	}
>> 
>>   out_unlock:
>>  @@ -465,20 +493,12 @@ static struct iio_dma_buffer_block 
>> *iio_dma_buffer_dequeue(
>>   	return block;
>>   }
>> 
>>  -/**
>>  - * iio_dma_buffer_read() - DMA buffer read callback
>>  - * @buffer: Buffer to read form
>>  - * @n: Number of bytes to read
>>  - * @user_buffer: Userspace buffer to copy the data to
>>  - *
>>  - * Should be used as the read callback for iio_buffer_access_ops
>>  - * struct for DMA buffers.
>>  - */
>>  -int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
>>  -	char __user *user_buffer)
>>  +static int iio_dma_buffer_io(struct iio_buffer *buffer,
>>  +			     size_t n, char __user *user_buffer, bool is_write)
>>   {
>>   	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
>>   	struct iio_dma_buffer_block *block;
>>  +	void *addr;
>>   	int ret;
>> 
>>   	if (n < buffer->bytes_per_datum)
>>  @@ -501,8 +521,13 @@ int iio_dma_buffer_read(struct iio_buffer 
>> *buffer, size_t n,
>>   	n = rounddown(n, buffer->bytes_per_datum);
>>   	if (n > block->bytes_used - queue->fileio.pos)
>>   		n = block->bytes_used - queue->fileio.pos;
>>  +	addr = block->vaddr + queue->fileio.pos;
>> 
>>  -	if (copy_to_user(user_buffer, block->vaddr + queue->fileio.pos, 
>> n)) {
>>  +	if (is_write)
>>  +		ret = copy_from_user(addr, user_buffer, n);
>>  +	else
>>  +		ret = copy_to_user(user_buffer, addr, n);
>>  +	if (ret) {
>>   		ret = -EFAULT;
>>   		goto out_unlock;
>>   	}
>>  @@ -521,8 +546,39 @@ int iio_dma_buffer_read(struct iio_buffer 
>> *buffer, size_t n,
>> 
>>   	return ret;
>>   }
>>  +
>>  +/**
>>  + * iio_dma_buffer_read() - DMA buffer read callback
>>  + * @buffer: Buffer to read from
>>  + * @n: Number of bytes to read
>>  + * @user_buffer: Userspace buffer to copy the data to
>>  + *
>>  + * Should be used as the read callback for iio_buffer_access_ops
>>  + * struct for DMA buffers.
>>  + */
>>  +int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
>>  +	char __user *user_buffer)
>>  +{
>>  +	return iio_dma_buffer_io(buffer, n, user_buffer, false);
>>  +}
>>   EXPORT_SYMBOL_GPL(iio_dma_buffer_read);
>> 
>>  +/**
>>  + * iio_dma_buffer_write() - DMA buffer write callback
>>  + * @buffer: Buffer to write to
>>  + * @n: Number of bytes to write
>>  + * @user_buffer: Userspace buffer to copy the data from
>>  + *
>>  + * Should be used as the write callback for iio_buffer_access_ops
>>  + * struct for DMA buffers.
>>  + */
>>  +int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
>>  +			 const char __user *user_buffer)
>>  +{
>>  +	return iio_dma_buffer_io(buffer, n, (__force char *)user_buffer, 
>> true);
> 
> Casting away the const is a little nasty.   Perhaps it's worth adding 
> a
> parameter to iio_dma_buffer_io so you can have different parameters
> for the read and write cases and hence keep the const in place?
> return iio_dma_buffer_io(buffer, n, NULL, user_buffer, true);
> and
> return iio_dma_buffer_io(buffer,n, user_buffer, NULL, false);

I can do that.

Cheers,
-Paul

>>  +}
>>  +EXPORT_SYMBOL_GPL(iio_dma_buffer_write);
>>  +
>>   /**
>>    * iio_dma_buffer_data_available() - DMA buffer data_available 
>> callback
>>    * @buf: Buffer to check for data availability
>>  diff --git a/include/linux/iio/buffer-dma.h 
>> b/include/linux/iio/buffer-dma.h
>>  index 18d3702fa95d..490b93f76fa8 100644
>>  --- a/include/linux/iio/buffer-dma.h
>>  +++ b/include/linux/iio/buffer-dma.h
>>  @@ -132,6 +132,8 @@ int iio_dma_buffer_disable(struct iio_buffer 
>> *buffer,
>>   	struct iio_dev *indio_dev);
>>   int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
>>   	char __user *user_buffer);
>>  +int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
>>  +			 const char __user *user_buffer);
>>   size_t iio_dma_buffer_data_available(struct iio_buffer *buffer);
>>   int iio_dma_buffer_set_bytes_per_datum(struct iio_buffer *buffer, 
>> size_t bpd);
>>   int iio_dma_buffer_set_length(struct iio_buffer *buffer, unsigned 
>> int length);
>>  @@ -142,4 +144,9 @@ int iio_dma_buffer_init(struct 
>> iio_dma_buffer_queue *queue,
>>   void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue);
>>   void iio_dma_buffer_release(struct iio_dma_buffer_queue *queue);
>> 
>>  +static inline size_t iio_dma_buffer_space_available(struct 
>> iio_buffer *buffer)
>>  +{
>>  +	return iio_dma_buffer_data_available(buffer);
>>  +}
>>  +
>>   #endif
> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure
  2022-03-28 17:37   ` Jonathan Cameron
@ 2022-03-28 18:44     ` Paul Cercueil
  2022-03-29 13:36       ` Jonathan Cameron
  0 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-03-28 18:44 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

Hi Jonathan,

On Mon, Mar 28 2022 at 18:37:01 +0100, Jonathan Cameron 
<jic23@kernel.org> wrote:
> On Mon,  7 Feb 2022 12:59:26 +0000
> Paul Cercueil <paul@crapouillou.net> wrote:
> 
>>  Add the necessary infrastructure to the IIO core to support a new
>>  optional DMABUF based interface.
>> 
>>  The advantage of this new DMABUF based interface vs. the read()
>>  interface, is that it avoids an extra copy of the data between the
>>  kernel and userspace. This is particularly userful for high-speed
> 
> useful
> 
>>  devices which produce several megabytes or even gigabytes of data 
>> per
>>  second.
>> 
>>  The data in this new DMABUF interface is managed at the granularity 
>> of
>>  DMABUF objects. Reducing the granularity from byte level to block 
>> level
>>  is done to reduce the userspace-kernelspace synchronization overhead
>>  since performing syscalls for each byte at a few Mbps is just not
>>  feasible.
>> 
>>  This of course leads to a slightly increased latency. For this 
>> reason an
>>  application can choose the size of the DMABUFs as well as how many 
>> it
>>  allocates. E.g. two DMABUFs would be a traditional double buffering
>>  scheme. But using a higher number might be necessary to avoid
>>  underflow/overflow situations in the presence of scheduling 
>> latencies.
>> 
>>  As part of the interface, 2 new IOCTLs have been added:
>> 
>>  IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *):
>>   Each call will allocate a new DMABUF object. The return value (if 
>> not
>>   a negative errno value as error) will be the file descriptor of 
>> the new
>>   DMABUF.
>> 
>>  IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
>>   Place the DMABUF object into the queue pending for hardware 
>> process.
>> 
>>  These two IOCTLs have to be performed on the IIO buffer's file
>>  descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
> 
> Just to check, do they work on the old deprecated chardev route? 
> Normally
> we can directly access the first buffer without the ioctl.

They do not. I think it's fine this way, since as you said, the old 
chardev route is deprecated. But I can add support for it with enough 
peer pressure.

>> 
>>  To access the data stored in a block by userspace the block must be
>>  mapped to the process's memory. This is done by calling mmap() on 
>> the
>>  DMABUF's file descriptor.
>> 
>>  Before accessing the data through the map, you must use the
>>  DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
>>  DMA_BUF_SYNC_START flag, to make sure that the data is available.
>>  This call may block until the hardware is done with this block. Once
>>  you are done reading or writing the data, you must use this ioctl 
>> again
>>  with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
>>  kernel's queue.
>> 
>>  If you need to know when the hardware is done with a DMABUF, you can
>>  poll its file descriptor for the EPOLLOUT event.
>> 
>>  Finally, to destroy a DMABUF object, simply call close() on its file
>>  descriptor.
>> 
>>  A typical workflow for the new interface is:
>> 
>>    for block in blocks:
>>      DMABUF_ALLOC block
>>      mmap block
>> 
>>    enable buffer
>> 
>>    while !done
>>      for block in blocks:
>>        DMABUF_ENQUEUE block
>> 
>>        DMABUF_SYNC_START block
>>        process data
>>        DMABUF_SYNC_END block
>> 
>>    disable buffer
>> 
>>    for block in blocks:
>>      close block
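The workflow quoted above can be sketched in plain C. This is a hedged sketch: the IIO struct layouts and ioctl numbers are copied from the uapi additions in this patch, the dma-buf sync definitions mirror include/uapi/linux/dma-buf.h, and error paths are abbreviated.

```c
/* Hedged userspace sketch of the quoted workflow. The IIO ioctl numbers
 * and struct layouts mirror the quoted patch; the dma-buf sync
 * definitions mirror include/uapi/linux/dma-buf.h. */
#include <stdint.h>
#include <sys/ioctl.h>

/* From the quoted include/uapi/linux/iio/buffer.h additions: */
struct iio_dmabuf_alloc_req { uint64_t size; uint64_t resv; };
struct iio_dmabuf { uint32_t fd; uint32_t flags; uint64_t bytes_used; };
#define IIO_BUFFER_DMABUF_ALLOC_IOCTL	_IOW('i', 0x92, struct iio_dmabuf_alloc_req)
#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL	_IOW('i', 0x93, struct iio_dmabuf)

/* From include/uapi/linux/dma-buf.h: */
struct dma_buf_sync { uint64_t flags; };
#define DMA_BUF_SYNC_RW		(3 << 0)
#define DMA_BUF_SYNC_START	(0 << 2)
#define DMA_BUF_SYNC_END	(1 << 2)
#define DMA_BUF_IOCTL_SYNC	_IOW('b', 0, struct dma_buf_sync)

/* "DMABUF_ALLOC block": returns the new DMABUF's fd, or -1 on error. */
static int iio_dmabuf_alloc(int buf_fd, uint64_t size)
{
	struct iio_dmabuf_alloc_req req = { .size = size, .resv = 0 };

	return ioctl(buf_fd, IIO_BUFFER_DMABUF_ALLOC_IOCTL, &req);
}

/* "DMABUF_ENQUEUE block": queue the DMABUF for hardware processing.
 * bytes_used == 0 means the whole buffer is used. */
static int iio_dmabuf_enqueue(int buf_fd, int dmabuf_fd, uint64_t bytes_used)
{
	struct iio_dmabuf dmabuf = {
		.fd = (uint32_t)dmabuf_fd, .flags = 0, .bytes_used = bytes_used,
	};

	return ioctl(buf_fd, IIO_BUFFER_DMABUF_ENQUEUE_IOCTL, &dmabuf);
}

/* "DMABUF_SYNC_START/END block": bracket CPU access to the mmap()ed data,
 * passing DMA_BUF_SYNC_START before and DMA_BUF_SYNC_END after. */
static int iio_dmabuf_sync(int dmabuf_fd, uint64_t start_or_end)
{
	struct dma_buf_sync sync = { .flags = start_or_end | DMA_BUF_SYNC_RW };

	return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}
```

Each helper maps onto one step of the pseudocode: alloc once per block, then enqueue/sync-start/process/sync-end in the steady-state loop, and plain close() tears a block down.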
> 
> Given my very limited knowledge of dma-buf, I'll leave commenting
> on the flow to others who know if this looks 'standards' or not ;)
> 
> Code looks sane to me..

Thanks.

Cheers,
-Paul

>> 
>>  v2: Only allow the new IOCTLs on the buffer FD created with
>>      IIO_BUFFER_GET_FD_IOCTL().
>> 
>>  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
>>  ---
>>   drivers/iio/industrialio-buffer.c | 55 
>> +++++++++++++++++++++++++++++++
>>   include/linux/iio/buffer_impl.h   |  8 +++++
>>   include/uapi/linux/iio/buffer.h   | 29 ++++++++++++++++
>>   3 files changed, 92 insertions(+)
>> 
>>  diff --git a/drivers/iio/industrialio-buffer.c 
>> b/drivers/iio/industrialio-buffer.c
>>  index 94eb9f6cf128..72f333a519bc 100644
>>  --- a/drivers/iio/industrialio-buffer.c
>>  +++ b/drivers/iio/industrialio-buffer.c
>>  @@ -17,6 +17,7 @@
>>   #include <linux/fs.h>
>>   #include <linux/cdev.h>
>>   #include <linux/slab.h>
>>  +#include <linux/mm.h>
>>   #include <linux/poll.h>
>>   #include <linux/sched/signal.h>
>> 
>>  @@ -1520,11 +1521,65 @@ static int iio_buffer_chrdev_release(struct 
>> inode *inode, struct file *filep)
>>   	return 0;
>>   }
>> 
>>  +static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
>>  +				     struct iio_dmabuf __user *user_buf)
>>  +{
>>  +	struct iio_dmabuf dmabuf;
>>  +
>>  +	if (!buffer->access->enqueue_dmabuf)
>>  +		return -EPERM;
>>  +
>>  +	if (copy_from_user(&dmabuf, user_buf, sizeof(dmabuf)))
>>  +		return -EFAULT;
>>  +
>>  +	if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
>>  +		return -EINVAL;
>>  +
>>  +	return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
>>  +}
>>  +
>>  +static int iio_buffer_alloc_dmabuf(struct iio_buffer *buffer,
>>  +				   struct iio_dmabuf_alloc_req __user *user_req)
>>  +{
>>  +	struct iio_dmabuf_alloc_req req;
>>  +
>>  +	if (!buffer->access->alloc_dmabuf)
>>  +		return -EPERM;
>>  +
>>  +	if (copy_from_user(&req, user_req, sizeof(req)))
>>  +		return -EFAULT;
>>  +
>>  +	if (req.resv)
>>  +		return -EINVAL;
>>  +
>>  +	return buffer->access->alloc_dmabuf(buffer, &req);
>>  +}
>>  +
>>  +static long iio_buffer_chrdev_ioctl(struct file *filp,
>>  +				    unsigned int cmd, unsigned long arg)
>>  +{
>>  +	struct iio_dev_buffer_pair *ib = filp->private_data;
>>  +	struct iio_buffer *buffer = ib->buffer;
>>  +	void __user *_arg = (void __user *)arg;
>>  +
>>  +	switch (cmd) {
>>  +	case IIO_BUFFER_DMABUF_ALLOC_IOCTL:
>>  +		return iio_buffer_alloc_dmabuf(buffer, _arg);
>>  +	case IIO_BUFFER_DMABUF_ENQUEUE_IOCTL:
>>  +		/* TODO: support non-blocking enqueue operation */
>>  +		return iio_buffer_enqueue_dmabuf(buffer, _arg);
>>  +	default:
>>  +		return IIO_IOCTL_UNHANDLED;
>>  +	}
>>  +}
>>  +
>>   static const struct file_operations iio_buffer_chrdev_fileops = {
>>   	.owner = THIS_MODULE,
>>   	.llseek = noop_llseek,
>>   	.read = iio_buffer_read,
>>   	.write = iio_buffer_write,
>>  +	.unlocked_ioctl = iio_buffer_chrdev_ioctl,
>>  +	.compat_ioctl = compat_ptr_ioctl,
>>   	.poll = iio_buffer_poll,
>>   	.release = iio_buffer_chrdev_release,
>>   };
>>  diff --git a/include/linux/iio/buffer_impl.h 
>> b/include/linux/iio/buffer_impl.h
>>  index e2ca8ea23e19..728541bc2c63 100644
>>  --- a/include/linux/iio/buffer_impl.h
>>  +++ b/include/linux/iio/buffer_impl.h
>>  @@ -39,6 +39,9 @@ struct iio_buffer;
>>    *                      device stops sampling. Calls are balanced 
>> with @enable.
>>    * @release:		called when the last reference to the buffer is 
>> dropped,
>>    *			should free all resources allocated by the buffer.
>>  + * @alloc_dmabuf:	called from userspace via ioctl to allocate one 
>> DMABUF.
>>  + * @enqueue_dmabuf:	called from userspace via ioctl to queue this 
>> DMABUF
>>  + *			object to this buffer. Requires a valid DMABUF fd.
>>    * @modes:		Supported operating modes by this buffer type
>>    * @flags:		A bitmask combination of INDIO_BUFFER_FLAG_*
>>    *
>>  @@ -68,6 +71,11 @@ struct iio_buffer_access_funcs {
>> 
>>   	void (*release)(struct iio_buffer *buffer);
>> 
>>  +	int (*alloc_dmabuf)(struct iio_buffer *buffer,
>>  +			    struct iio_dmabuf_alloc_req *req);
>>  +	int (*enqueue_dmabuf)(struct iio_buffer *buffer,
>>  +			      struct iio_dmabuf *block);
>>  +
>>   	unsigned int modes;
>>   	unsigned int flags;
>>   };
>>  diff --git a/include/uapi/linux/iio/buffer.h 
>> b/include/uapi/linux/iio/buffer.h
>>  index 13939032b3f6..e4621b926262 100644
>>  --- a/include/uapi/linux/iio/buffer.h
>>  +++ b/include/uapi/linux/iio/buffer.h
>>  @@ -5,6 +5,35 @@
>>   #ifndef _UAPI_IIO_BUFFER_H_
>>   #define _UAPI_IIO_BUFFER_H_
>> 
>>  +#include <linux/types.h>
>>  +
>>  +#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000000
>>  +
>>  +/**
>>  + * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO 
>> DMABUFs
>>  + * @size:	the size of a single DMABUF
>>  + * @resv:	reserved
>>  + */
>>  +struct iio_dmabuf_alloc_req {
>>  +	__u64 size;
>>  +	__u64 resv;
>>  +};
>>  +
>>  +/**
>>  + * struct iio_dmabuf - Descriptor for a single IIO DMABUF object
>>  + * @fd:		file descriptor of the DMABUF object
>>  + * @flags:	one or more IIO_BUFFER_DMABUF_* flags
>>  + * @bytes_used:	number of bytes used in this DMABUF for the data 
>> transfer.
>>  + *		If zero, the full buffer is used.
>>  + */
>>  +struct iio_dmabuf {
>>  +	__u32 fd;
>>  +	__u32 flags;
>>  +	__u64 bytes_used;
>>  +};
>>  +
>>   #define IIO_BUFFER_GET_FD_IOCTL			_IOWR('i', 0x91, int)
>>  +#define IIO_BUFFER_DMABUF_ALLOC_IOCTL		_IOW('i', 0x92, struct iio_dmabuf_alloc_req)
>>  +#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL		_IOW('i', 0x93, struct iio_dmabuf)
>> 
>>   #endif /* _UAPI_IIO_BUFFER_H_ */
> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution
  2022-03-28 17:54   ` Jonathan Cameron
  2022-03-28 17:54     ` Christian König
@ 2022-03-28 19:16     ` Paul Cercueil
  2022-03-28 20:58       ` Andy Shevchenko
  1 sibling, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-03-28 19:16 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

Hi Jonathan,

On Mon, Mar 28 2022 at 18:54:25 +0100, Jonathan Cameron <jic23@kernel.org> wrote:
> On Mon,  7 Feb 2022 12:59:28 +0000
> Paul Cercueil <paul@crapouillou.net> wrote:
> 
>>  Enhance the current fileio code by using DMABUF objects instead of
>>  custom buffers.
>> 
>>  This adds more code than it removes, but:
>>  - a lot of the complexity can be dropped, e.g. custom kref and
>>    iio_buffer_block_put_atomic() are not needed anymore;
>>  - it will be much easier to introduce an API to export these DMABUF
>>    objects to userspace in a following patch.
>> 
>>  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> Hi Paul,
> 
> I'm a bit rusty on dma mappings, but you seem to have
> a mixture of streaming and coherent mappings going on in here.

That's OK, so am I. What do you call "streaming mappings"?

> Is it the case that the current code is using the coherent mappings
> and a potential 'other user' of the dma buffer might need
> streaming mappings?

Something like that. There are two different things; in both cases,
userspace needs to create a DMABUF with IIO_BUFFER_DMABUF_ALLOC_IOCTL,
and the backing memory is allocated with dma_alloc_coherent().

- For the userspace interface, you then have a "cpu access" IOCTL
(DMA_BUF_IOCTL_SYNC) that lets userspace signal when it starts/finishes
processing the buffer in user space (which will sync/invalidate the
data cache if needed). A buffer can then be enqueued for DMA processing
(TX or RX) with the new IIO_BUFFER_DMABUF_ENQUEUE_IOCTL.
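
In userspace that sequence might look like this (a hypothetical sketch against the proposed IOCTLs; dmabuf_fd and buffer_fd are assumed to have been obtained earlier, and error handling is omitted):

```c
struct dma_buf_sync sync = { .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW };
ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);	/* begin CPU access */
/* ... read or fill the mmap()ed buffer ... */
sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_RW;
ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);	/* end CPU access */

/* Hand the block to the hardware queue (TX or RX). */
struct iio_dmabuf iio_buf = { .fd = dmabuf_fd, .bytes_used = 0 };
ioctl(buffer_fd, IIO_BUFFER_DMABUF_ENQUEUE_IOCTL, &iio_buf);
```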

- When the DMABUF created via the IIO core is sent to another driver
through the driver's custom DMABUF import function, this driver will
call dma_buf_attach(), which will call iio_buffer_dma_buf_map(). Since
it has to return a "struct sg_table *", this function then simply
creates an sg_table with one entry that points to the backing memory.

Note that I added the iio_buffer_dma_buf_map() / _unmap() functions
because the dma-buf core would WARN() if they were not provided. But
since this code doesn't yet support importing/exporting DMABUFs to
other drivers, they are never called, and I should probably just make
them return an ERR_PTR() unconditionally.

Cheers,
-Paul

> Jonathan
> 
>>  ---
>>   drivers/iio/buffer/industrialio-buffer-dma.c | 192 ++++++++++++-------
>>   include/linux/iio/buffer-dma.h               |   8 +-
>>   2 files changed, 122 insertions(+), 78 deletions(-)
>> 
>>  diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
>>  index 15ea7bc3ac08..54e6000cd2ee 100644
>>  --- a/drivers/iio/buffer/industrialio-buffer-dma.c
>>  +++ b/drivers/iio/buffer/industrialio-buffer-dma.c
>>  @@ -14,6 +14,7 @@
>>   #include <linux/poll.h>
>>   #include <linux/iio/buffer_impl.h>
>>   #include <linux/iio/buffer-dma.h>
>>  +#include <linux/dma-buf.h>
>>   #include <linux/dma-mapping.h>
>>   #include <linux/sizes.h>
>> 
>>  @@ -90,103 +91,145 @@
>>    * callback is called from within the custom callback.
>>    */
>> 
>>  -static void iio_buffer_block_release(struct kref *kref)
>>  -{
>>  -	struct iio_dma_buffer_block *block = container_of(kref,
>>  -		struct iio_dma_buffer_block, kref);
>>  -
>>  -	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
>>  -
>>  -	dma_free_coherent(block->queue->dev, PAGE_ALIGN(block->size),
>>  -					block->vaddr, block->phys_addr);
>>  -
>>  -	iio_buffer_put(&block->queue->buffer);
>>  -	kfree(block);
>>  -}
>>  -
>>  -static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
>>  -{
>>  -	kref_get(&block->kref);
>>  -}
>>  -
>>  -static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
>>  -{
>>  -	kref_put(&block->kref, iio_buffer_block_release);
>>  -}
>>  -
>>  -/*
>>  - * dma_free_coherent can sleep, hence we need to take some special care to be
>>  - * able to drop a reference from an atomic context.
>>  - */
>>  -static LIST_HEAD(iio_dma_buffer_dead_blocks);
>>  -static DEFINE_SPINLOCK(iio_dma_buffer_dead_blocks_lock);
>>  -
>>  -static void iio_dma_buffer_cleanup_worker(struct work_struct *work)
>>  -{
>>  -	struct iio_dma_buffer_block *block, *_block;
>>  -	LIST_HEAD(block_list);
>>  -
>>  -	spin_lock_irq(&iio_dma_buffer_dead_blocks_lock);
>>  -	list_splice_tail_init(&iio_dma_buffer_dead_blocks, &block_list);
>>  -	spin_unlock_irq(&iio_dma_buffer_dead_blocks_lock);
>>  -
>>  -	list_for_each_entry_safe(block, _block, &block_list, head)
>>  -		iio_buffer_block_release(&block->kref);
>>  -}
>>  -static DECLARE_WORK(iio_dma_buffer_cleanup_work, iio_dma_buffer_cleanup_worker);
>>  -
>>  -static void iio_buffer_block_release_atomic(struct kref *kref)
>>  -{
>>  +struct iio_buffer_dma_buf_attachment {
>>  +	struct scatterlist sgl;
>>  +	struct sg_table sg_table;
>>   	struct iio_dma_buffer_block *block;
>>  -	unsigned long flags;
>>  -
>>  -	block = container_of(kref, struct iio_dma_buffer_block, kref);
>>  -
>>  -	spin_lock_irqsave(&iio_dma_buffer_dead_blocks_lock, flags);
>>  -	list_add_tail(&block->head, &iio_dma_buffer_dead_blocks);
>>  -	spin_unlock_irqrestore(&iio_dma_buffer_dead_blocks_lock, flags);
>>  -
>>  -	schedule_work(&iio_dma_buffer_cleanup_work);
>>  -}
>>  -
>>  -/*
>>  - * Version of iio_buffer_block_put() that can be called from atomic context
>>  - */
>>  -static void iio_buffer_block_put_atomic(struct iio_dma_buffer_block *block)
>>  -{
>>  -	kref_put(&block->kref, iio_buffer_block_release_atomic);
>>  -}
>>  +};
>> 
>>   static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
>>   {
>>   	return container_of(buf, struct iio_dma_buffer_queue, buffer);
>>   }
>> 
>>  +static struct iio_buffer_dma_buf_attachment *
>>  +to_iio_buffer_dma_buf_attachment(struct sg_table *table)
>>  +{
>>  +	return container_of(table, struct iio_buffer_dma_buf_attachment, sg_table);
>>  +}
>>  +
>>  +static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
>>  +{
>>  +	get_dma_buf(block->dmabuf);
>>  +}
>>  +
>>  +static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
>>  +{
>>  +	dma_buf_put(block->dmabuf);
>>  +}
>>  +
>>  +static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
>>  +				     struct dma_buf_attachment *at)
>>  +{
>>  +	at->priv = dbuf->priv;
>>  +
>>  +	return 0;
>>  +}
>>  +
>>  +static struct sg_table *iio_buffer_dma_buf_map(struct dma_buf_attachment *at,
>>  +					       enum dma_data_direction dma_dir)
>>  +{
>>  +	struct iio_dma_buffer_block *block = at->priv;
>>  +	struct iio_buffer_dma_buf_attachment *dba;
>>  +	int ret;
>>  +
>>  +	dba = kzalloc(sizeof(*dba), GFP_KERNEL);
>>  +	if (!dba)
>>  +		return ERR_PTR(-ENOMEM);
>>  +
>>  +	sg_init_one(&dba->sgl, block->vaddr, PAGE_ALIGN(block->size));
>>  +	dba->sg_table.sgl = &dba->sgl;
>>  +	dba->sg_table.nents = 1;
>>  +	dba->block = block;
>>  +
>>  +	ret = dma_map_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
>>  +	if (ret) {
>>  +		kfree(dba);
>>  +		return ERR_PTR(ret);
>>  +	}
>>  +
>>  +	return &dba->sg_table;
>>  +}
>>  +
>>  +static void iio_buffer_dma_buf_unmap(struct dma_buf_attachment *at,
>>  +				     struct sg_table *sg_table,
>>  +				     enum dma_data_direction dma_dir)
>>  +{
>>  +	struct iio_buffer_dma_buf_attachment *dba =
>>  +		to_iio_buffer_dma_buf_attachment(sg_table);
>>  +
>>  +	dma_unmap_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
>>  +	kfree(dba);
>>  +}
>>  +
>>  +static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
>>  +{
>>  +	struct iio_dma_buffer_block *block = dbuf->priv;
>>  +	struct iio_dma_buffer_queue *queue = block->queue;
>>  +
>>  +	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
>>  +
>>  +	mutex_lock(&queue->lock);
>>  +
>>  +	dma_free_coherent(queue->dev, PAGE_ALIGN(block->size),
>>  +			  block->vaddr, block->phys_addr);
>>  +	kfree(block);
>>  +
>>  +	mutex_unlock(&queue->lock);
>>  +	iio_buffer_put(&queue->buffer);
>>  +}
>>  +
>>  +static const struct dma_buf_ops iio_dma_buffer_dmabuf_ops = {
>>  +	.attach			= iio_buffer_dma_buf_attach,
>>  +	.map_dma_buf		= iio_buffer_dma_buf_map,
>>  +	.unmap_dma_buf		= iio_buffer_dma_buf_unmap,
>>  +	.release		= iio_buffer_dma_buf_release,
>>  +};
>>  +
>>   static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
>>   	struct iio_dma_buffer_queue *queue, size_t size)
>>   {
>>   	struct iio_dma_buffer_block *block;
>>  +	DEFINE_DMA_BUF_EXPORT_INFO(einfo);
>>  +	struct dma_buf *dmabuf;
>>  +	int err = -ENOMEM;
>> 
>>   	block = kzalloc(sizeof(*block), GFP_KERNEL);
>>   	if (!block)
>>  -		return NULL;
>>  +		return ERR_PTR(err);
>> 
>>   	block->vaddr = dma_alloc_coherent(queue->dev, PAGE_ALIGN(size),
>>   		&block->phys_addr, GFP_KERNEL);
>>  -	if (!block->vaddr) {
>>  -		kfree(block);
>>  -		return NULL;
>>  +	if (!block->vaddr)
>>  +		goto err_free_block;
>>  +
>>  +	einfo.ops = &iio_dma_buffer_dmabuf_ops;
>>  +	einfo.size = PAGE_ALIGN(size);
>>  +	einfo.priv = block;
>>  +	einfo.flags = O_RDWR;
>>  +
>>  +	dmabuf = dma_buf_export(&einfo);
>>  +	if (IS_ERR(dmabuf)) {
>>  +		err = PTR_ERR(dmabuf);
>>  +		goto err_free_dma;
>>   	}
>> 
>>  +	block->dmabuf = dmabuf;
>>   	block->size = size;
>>   	block->state = IIO_BLOCK_STATE_DONE;
>>   	block->queue = queue;
>>   	INIT_LIST_HEAD(&block->head);
>>  -	kref_init(&block->kref);
>> 
>>   	iio_buffer_get(&queue->buffer);
>> 
>>   	return block;
>>  +
>>  +err_free_dma:
>>  +	dma_free_coherent(queue->dev, PAGE_ALIGN(size),
>>  +			  block->vaddr, block->phys_addr);
>>  +err_free_block:
>>  +	kfree(block);
>>  +	return ERR_PTR(err);
>>   }
>> 
>>   static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
>>  @@ -223,7 +266,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
>>   	_iio_dma_buffer_block_done(block);
>>   	spin_unlock_irqrestore(&queue->list_lock, flags);
>> 
>>  -	iio_buffer_block_put_atomic(block);
>>  +	iio_buffer_block_put(block);
>>   	iio_dma_buffer_queue_wake(queue);
>>   }
>>   EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
>>  @@ -249,7 +292,8 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
>>   		list_del(&block->head);
>>   		block->bytes_used = 0;
>>   		_iio_dma_buffer_block_done(block);
>>  -		iio_buffer_block_put_atomic(block);
>>  +
>>  +		iio_buffer_block_put(block);
>>   	}
>>   	spin_unlock_irqrestore(&queue->list_lock, flags);
>> 
>>  @@ -340,8 +384,8 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
>> 
>>   		if (!block) {
>>   			block = iio_dma_buffer_alloc_block(queue, size);
>>  -			if (!block) {
>>  -				ret = -ENOMEM;
>>  +			if (IS_ERR(block)) {
>>  +				ret = PTR_ERR(block);
>>   				goto out_unlock;
>>   			}
>>   			queue->fileio.blocks[i] = block;
>>  diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
>>  index 490b93f76fa8..6b3fa7d2124b 100644
>>  --- a/include/linux/iio/buffer-dma.h
>>  +++ b/include/linux/iio/buffer-dma.h
>>  @@ -8,7 +8,6 @@
>>   #define __INDUSTRIALIO_DMA_BUFFER_H__
>> 
>>   #include <linux/list.h>
>>  -#include <linux/kref.h>
>>   #include <linux/spinlock.h>
>>   #include <linux/mutex.h>
>>   #include <linux/iio/buffer_impl.h>
>>  @@ -16,6 +15,7 @@
>>   struct iio_dma_buffer_queue;
>>   struct iio_dma_buffer_ops;
>>   struct device;
>>  +struct dma_buf;
>> 
>>   /**
>>    * enum iio_block_state - State of a struct iio_dma_buffer_block
>>  @@ -39,8 +39,8 @@ enum iio_block_state {
>>    * @vaddr: Virutal address of the blocks memory
>>    * @phys_addr: Physical address of the blocks memory
>>    * @queue: Parent DMA buffer queue
>>  - * @kref: kref used to manage the lifetime of block
>>    * @state: Current state of the block
>>  + * @dmabuf: Underlying DMABUF object
>>    */
>>   struct iio_dma_buffer_block {
>>   	/* May only be accessed by the owner of the block */
>>  @@ -56,13 +56,13 @@ struct iio_dma_buffer_block {
>>   	size_t size;
>>   	struct iio_dma_buffer_queue *queue;
>> 
>>  -	/* Must not be accessed outside the core. */
>>  -	struct kref kref;
>>   	/*
>>   	 * Must not be accessed outside the core. Access needs to hold
>>   	 * queue->list_lock if the block is not owned by the core.
>>   	 */
>>   	enum iio_block_state state;
>>  +
>>  +	struct dma_buf *dmabuf;
>>   };
>> 
>>   /**
> 




* Re: [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support
  2022-02-07 12:59 ` [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support Paul Cercueil
  2022-03-28 17:24   ` Jonathan Cameron
@ 2022-03-28 20:38   ` Andy Shevchenko
  1 sibling, 0 replies; 41+ messages in thread
From: Andy Shevchenko @ 2022-03-28 20:38 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Jonathan Cameron, Michael Hennerich, Lars-Peter Clausen,
	Christian König, Sumit Semwal, Jonathan Corbet,
	Alexandru Ardelean, dri-devel, linaro-mm-sig,
	Linux Documentation List, Linux Kernel Mailing List, linux-iio

On Wed, Feb 9, 2022 at 9:10 AM Paul Cercueil <paul@crapouillou.net> wrote:
>
> Adding write support to the buffer-dma code is easy - the write()
> function basically needs to do the exact same thing as the read()
> function: dequeue a block, read or write the data, enqueue the block
> when entirely processed.
>
> Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write()
> now both call a function iio_buffer_dma_io(), which will perform this
> task.
>
> The .space_available() callback can return the exact same value as the
> .data_available() callback for input buffers, since in both cases we
> count the exact same thing (the number of bytes in each available
> block).
>
> Note that we preemptively reset block->bytes_used to the buffer's size
> in iio_dma_buffer_request_update(), as in the future the
> iio_dma_buffer_enqueue() function won't reset it.

...

> v2: - Fix block->state not being reset in
>       iio_dma_buffer_request_update() for output buffers.
>     - Only update block->bytes_used once and add a comment about why we
>       update it.
>     - Add a comment about why we're setting a different state for output
>       buffers in iio_dma_buffer_request_update()
>     - Remove useless cast to bool (!!) in iio_dma_buffer_io()

Usual place for changelog is after the cutter '--- ' line below...

> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
> ---

...somewhere here.

>  drivers/iio/buffer/industrialio-buffer-dma.c | 88 ++++++++++++++++----
>  include/linux/iio/buffer-dma.h               |  7 ++

...

> +static int iio_dma_buffer_io(struct iio_buffer *buffer,
> +                            size_t n, char __user *user_buffer, bool is_write)

I believe there is room for size_t n on the previous line.

...

> +       if (is_write)

I would name it is_from_user.

> +               ret = copy_from_user(addr, user_buffer, n);
> +       else
> +               ret = copy_to_user(user_buffer, addr, n);

...

> +int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
> +                        const char __user *user_buffer)
> +{
> +       return iio_dma_buffer_io(buffer, n, (__force char *)user_buffer, true);

Why do you drop address space markers?

> +}

-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure
  2022-02-07 12:59 ` [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure Paul Cercueil
  2022-03-28 17:37   ` Jonathan Cameron
@ 2022-03-28 20:46   ` Andy Shevchenko
  1 sibling, 0 replies; 41+ messages in thread
From: Andy Shevchenko @ 2022-03-28 20:46 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Jonathan Cameron, Michael Hennerich, Lars-Peter Clausen,
	Christian König, Sumit Semwal, Jonathan Corbet,
	Alexandru Ardelean, dri-devel, linaro-mm-sig,
	Linux Documentation List, Linux Kernel Mailing List, linux-iio

On Tue, Feb 8, 2022 at 5:26 PM Paul Cercueil <paul@crapouillou.net> wrote:
>
> Add the necessary infrastructure to the IIO core to support a new
> optional DMABUF based interface.
>
> The advantage of this new DMABUF based interface vs. the read()
> interface, is that it avoids an extra copy of the data between the
> kernel and userspace. This is particularly userful for high-speed

useful

> devices which produce several megabytes or even gigabytes of data per
> second.
>
> The data in this new DMABUF interface is managed at the granularity of
> DMABUF objects. Reducing the granularity from byte level to block level
> is done to reduce the userspace-kernelspace synchronization overhead
> since performing syscalls for each byte at a few Mbps is just not
> feasible.
>
> This of course leads to a slightly increased latency. For this reason an
> application can choose the size of the DMABUFs as well as how many it
> allocates. E.g. two DMABUFs would be a traditional double buffering
> scheme. But using a higher number might be necessary to avoid
> underflow/overflow situations in the presence of scheduling latencies.
>
> As part of the interface, 2 new IOCTLs have been added:
>
> IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *):
>  Each call will allocate a new DMABUF object. The return value (if not
>  a negative errno value as error) will be the file descriptor of the new
>  DMABUF.
>
> IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
>  Place the DMABUF object into the queue of blocks pending hardware processing.
>
> These two IOCTLs have to be performed on the IIO buffer's file
> descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
>
> To access the data stored in a block by userspace the block must be
> mapped to the process's memory. This is done by calling mmap() on the
> DMABUF's file descriptor.
>
> Before accessing the data through the map, you must use the
> DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
> DMA_BUF_SYNC_START flag, to make sure that the data is available.
> This call may block until the hardware is done with this block. Once
> you are done reading or writing the data, you must use this ioctl again
> with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
> kernel's queue.
>
> If you need to know when the hardware is done with a DMABUF, you can
> poll its file descriptor for the EPOLLOUT event.
>
> Finally, to destroy a DMABUF object, simply call close() on its file
> descriptor.
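
Putting the steps described above together, a hypothetical userspace consumer could look like this (a sketch only: the device path, buffer size and lack of error handling are assumptions, and the IOCTLs are the ones introduced by this series):

```c
int fd = open("/dev/iio:device0", O_RDWR);
int buf_fd, dmabuf_fd;

/* Get the buffer's own file descriptor, then allocate one DMABUF. */
ioctl(fd, IIO_BUFFER_GET_FD_IOCTL, &buf_fd);
struct iio_dmabuf_alloc_req req = { .size = 4096 };
dmabuf_fd = ioctl(buf_fd, IIO_BUFFER_DMABUF_ALLOC_IOCTL, &req);

/* Map the DMABUF's memory into the process. */
void *data = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
		  dmabuf_fd, 0);

struct dma_buf_sync sync = { .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW };
ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);	/* claim CPU access */
/* ... process the samples in 'data' ... */
sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_RW;
ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);	/* release CPU access */

/* Hand the block to the hardware queue. */
struct iio_dmabuf iio_buf = { .fd = dmabuf_fd, .bytes_used = 0 };
ioctl(buf_fd, IIO_BUFFER_DMABUF_ENQUEUE_IOCTL, &iio_buf);

/* Wait until the hardware is done with it. */
struct pollfd pfd = { .fd = dmabuf_fd, .events = POLLOUT };
poll(&pfd, 1, -1);

close(dmabuf_fd);	/* destroys the DMABUF once all references drop */
```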

...

> v2: Only allow the new IOCTLs on the buffer FD created with
>     IIO_BUFFER_GET_FD_IOCTL().

Move changelogs after the cutter '--- ' line.

...

>  static const struct file_operations iio_buffer_chrdev_fileops = {
>         .owner = THIS_MODULE,
>         .llseek = noop_llseek,
>         .read = iio_buffer_read,
>         .write = iio_buffer_write,
> +       .unlocked_ioctl = iio_buffer_chrdev_ioctl,

> +       .compat_ioctl = compat_ptr_ioctl,

Is this member always available (implying the kernel configuration)?

...

> +#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS      0x00000000

No flags available right now?

...

> + * @bytes_used:        number of bytes used in this DMABUF for the data transfer.
> + *             If zero, the full buffer is used.

Wouldn't it be error-prone to have 0 defined like this?

-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution
  2022-03-28 19:16     ` Paul Cercueil
@ 2022-03-28 20:58       ` Andy Shevchenko
  0 siblings, 0 replies; 41+ messages in thread
From: Andy Shevchenko @ 2022-03-28 20:58 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Jonathan Cameron, Michael Hennerich, Lars-Peter Clausen,
	Christian König, Sumit Semwal, Jonathan Corbet,
	Alexandru Ardelean, dri-devel, linaro-mm-sig,
	Linux Documentation List, Linux Kernel Mailing List, linux-iio

On Mon, Mar 28, 2022 at 11:30 PM Paul Cercueil <paul@crapouillou.net> wrote:
> Le lun., mars 28 2022 at 18:54:25 +0100, Jonathan Cameron
> <jic23@kernel.org> a écrit :
> > On Mon,  7 Feb 2022 12:59:28 +0000
> > Paul Cercueil <paul@crapouillou.net> wrote:
> >
> >>  Enhance the current fileio code by using DMABUF objects instead of
> >>  custom buffers.
> >>
> >>  This adds more code than it removes, but:
> >>  - a lot of the complexity can be dropped, e.g. custom kref and
> >>    iio_buffer_block_put_atomic() are not needed anymore;
> >>  - it will be much easier to introduce an API to export these DMABUF
> >>    objects to userspace in a following patch.

> > I'm a bit rusty on dma mappings, but you seem to have
> > a mixture of streaming and coherent mappings going on in here.
>
> That's OK, so am I. What do you call "streaming mappings"?

dma_*_coherent() is for coherent mappings (usually done once, with
cache coherency between device and CPU accesses guaranteed for the
lifetime of the mapping).
dma_map_*() is for streaming mappings, which means you often map
arbitrary pages for the duration of a single transfer (usually used
when you want to keep the previous data while processing newly
arriving data, or when the new data is supplied at a different virtual
address and hence has to be mapped for DMA).
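
To make the contrast concrete, a schematic kernel-side sketch (not taken from the patch; dev, buf and size are placeholders):

```c
/* Coherent mapping: allocated once, coherency maintained by the API
 * for the lifetime of the buffer. This is what the IIO DMA buffer
 * code uses for its blocks. */
vaddr = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL);
/* ... both device and CPU can access the memory ... */
dma_free_coherent(dev, size, vaddr, dma_handle);

/* Streaming mapping: map an existing buffer for one transfer, with
 * explicit ownership handoff at map/unmap (or sync) time. */
dma_handle = dma_map_single(dev, buf, size, DMA_TO_DEVICE);
/* ... device owns the buffer until it is unmapped ... */
dma_unmap_single(dev, dma_handle, size, DMA_TO_DEVICE);
```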

> > Is it the case that the current code is using the coherent mappings
> > and a potential 'other user' of the dma buffer might need
> > streaming mappings?
>
> Something like that. There are two different things; on both cases,
> userspace needs to create a DMABUF with IIO_BUFFER_DMABUF_ALLOC_IOCTL,
> and the backing memory is allocated with dma_alloc_coherent().
>
> - For the userspace interface, you then have a "cpu access" IOCTL
> (DMA_BUF_IOCTL_SYNC), that allows userspace to inform when it will
> start/finish to process the buffer in user-space (which will
> sync/invalidate the data cache if needed). A buffer can then be
> enqueued for DMA processing (TX or RX) with the new
> IIO_BUFFER_DMABUF_ENQUEUE_IOCTL.
>
> - When the DMABUF created via the IIO core is sent to another driver
> through the driver's custom DMABUF import function, this driver will
> call dma_buf_attach(), which will call iio_buffer_dma_buf_map(). Since
> it has to return a "struct sg_table *", this function then simply
> creates a sgtable with one entry that points to the backing memory.

...

> >>  +   ret = dma_map_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
> >>  +   if (ret) {
> >>  +           kfree(dba);
> >>  +           return ERR_PTR(ret);
> >>  +   }

Missed DMA mapping error check.

> >>  +
> >>  +   return &dba->sg_table;
> >>  +}

...

> >>  -   /* Must not be accessed outside the core. */
> >>  -   struct kref kref;


> >>  +   struct dma_buf *dmabuf;

Is it okay to access outside the core? If no, why did you remove
(actually not modify) the comment?

-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support
  2022-03-28 18:39     ` Paul Cercueil
@ 2022-03-29  7:11       ` Nuno Sá
  0 siblings, 0 replies; 41+ messages in thread
From: Nuno Sá @ 2022-03-29  7:11 UTC (permalink / raw)
  To: Paul Cercueil, Jonathan Cameron
  Cc: Michael Hennerich, Lars-Peter Clausen, Christian König,
	Sumit Semwal, Jonathan Corbet, Alexandru Ardelean, dri-devel,
	linaro-mm-sig, linux-doc, linux-kernel, linux-iio

On Mon, 2022-03-28 at 19:39 +0100, Paul Cercueil wrote:
> Hi Jonathan,
> 
> On Mon, Mar 28 2022 at 18:24:09 +0100, Jonathan Cameron <jic23@kernel.org> wrote:
> > On Mon,  7 Feb 2022 12:59:23 +0000
> > Paul Cercueil <paul@crapouillou.net> wrote:
> > 
> > >  Adding write support to the buffer-dma code is easy - the write()
> > >  function basically needs to do the exact same thing as the read()
> > >  function: dequeue a block, read or write the data, enqueue the block
> > >  when entirely processed.
> > > 
> > >  Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write()
> > >  now both call a function iio_buffer_dma_io(), which will perform this
> > >  task.
> > > 
> > >  The .space_available() callback can return the exact same value as the
> > >  .data_available() callback for input buffers, since in both cases we
> > >  count the exact same thing (the number of bytes in each available
> > >  block).
> > > 
> > >  Note that we preemptively reset block->bytes_used to the buffer's size
> > >  in iio_dma_buffer_request_update(), as in the future the
> > >  iio_dma_buffer_enqueue() function won't reset it.
> > > 
> > >  v2: - Fix block->state not being reset in
> > >        iio_dma_buffer_request_update() for output buffers.
> > >      - Only update block->bytes_used once and add a comment about why we
> > >        update it.
> > >      - Add a comment about why we're setting a different state for output
> > >        buffers in iio_dma_buffer_request_update()
> > >      - Remove useless cast to bool (!!) in iio_dma_buffer_io()
> > > 
> > >  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > >  Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
> > One comment inline.
> > 
> > I'd be tempted to queue this up with that fixed, but do we have
> > any users?  Even though it's trivial I'm not that keen on code
> > upstream well in advance of it being used.
> 
> There's a userspace user in libiio. On the kernel side we do have 
> drivers that use it in ADI's downstream kernel, that we plan to 
> upstream in the long term (but it can take some time, as we need to 
> upstream other things first, like JESD204B support).
> 
> 

You mean, users for DMA output buffers? If so, I have on my queue to
add the dac counterpart of this one:

https://elixir.bootlin.com/linux/latest/source/drivers/iio/adc/adi-axi-adc.c

Which is a user of DMA buffers. Though this one does not depend on
JESD204, I suspect it will also be a tricky process mainly because I
think there are major issues on how things are done right now (on the
ADC driver).

But yeah, not a topic here and I do plan to first start the discussion
on the mailing list before starting developing (hopefully in the coming
weeks)...

- Nuno Sá




* Re: [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
  2022-02-15 17:43   ` Paul Cercueil
@ 2022-03-29  8:33     ` Daniel Vetter
  2022-03-29  9:11       ` Paul Cercueil
  0 siblings, 1 reply; 41+ messages in thread
From: Daniel Vetter @ 2022-03-29  8:33 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Jonathan Cameron, Michael Hennerich, Jonathan Corbet, linux-iio,
	linux-doc, linux-kernel, dri-devel, Sumit Semwal, linaro-mm-sig,
	Alexandru Ardelean, Christian König

On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
> Hi Jonathan,
> 
> On Sun, Feb 13 2022 at 18:46:16 +0000, Jonathan Cameron <jic23@kernel.org> wrote:
> > On Mon,  7 Feb 2022 12:59:21 +0000
> > Paul Cercueil <paul@crapouillou.net> wrote:
> > 
> > >  Hi Jonathan,
> > > 
> > >  This is the V2 of my patchset that introduces a new userspace interface
> > >  based on DMABUF objects to complement the fileio API, and adds write()
> > >  support to the existing fileio API.
> > 
> > Hi Paul,
> > 
> > It's been a little while. Perhaps you could summarize the various
> > viewpoints around the appropriateness of using DMABUF for this?
> > I appreciate it is a tricky topic to distil into a brief summary but
> > I know I would find it useful even if no one else does!
> 
> So we want to have a high-speed interface where buffers of samples are
> passed around between IIO devices and other devices (e.g. USB or network),
> or made available to userspace without copying the data.
> 
> DMABUF is, at least in theory, exactly what we need. Quoting the
> documentation
> (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html):
> "The dma-buf subsystem provides the framework for sharing buffers for
> hardware (DMA) access across multiple device drivers and subsystems, and for
> synchronizing asynchronous hardware access. This is used, for example, by
> drm “prime” multi-GPU support, but is of course not limited to GPU use
> cases."
> 
> The problem is that right now DMABUF is only really used by DRM, and to
> quote Daniel, "dma-buf looks like something super generic and useful, until
you realize that there's a metric ton of gpu/accelerator baggage piled in".
> 
> Still, it seems to be the only viable option. We could add a custom
> buffer-passing interface, but that would mean implementing the same
> buffer-passing interface on the network and USB stacks, and before we know
> it we re-invented DMABUFs.

dma-buf also doesn't support sharing with network and usb stacks, so I'm a
bit confused why exactly this is useful?

So yeah unless there's some sharing going on with gpu stuff (for data
processing maybe) I'm not sure this makes a lot of sense really. Or at
least some zero-copy sharing between drivers, but even that would
minimally require a dma-buf import ioctl of some sorts. Which I either
missed or doesn't exist.

If there's none of that then just hand-roll your buffer handling code
(xarray is cheap to use in terms of code for this), you can always add
dma-buf import/export later on when the need arises.
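
The hand-rolled tracking suggested here might be as small as this (a hypothetical sketch of xarray-based block bookkeeping; names are made up and nothing like this exists in the series):

```c
static DEFINE_XARRAY_ALLOC(iio_dmabuf_blocks);

/* Track a newly allocated block under a small integer id. */
u32 id;
int err = xa_alloc(&iio_dmabuf_blocks, &id, block, xa_limit_16b,
		   GFP_KERNEL);

/* Look the block up again later, then drop it. */
struct iio_dma_buffer_block *b = xa_load(&iio_dmabuf_blocks, id);
xa_erase(&iio_dmabuf_blocks, id);
```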

Scrolling through patches you only have dma-buf export, but no importing,
so the use-case that works is with one of the existing subsystems that
supporting dma-buf importing.

I think minimally we need the use-case (in form of code) that needs the
buffer sharing here.
-Daniel

> Cheers,
> -Paul
> 
> 
> > > 
> > >  Changes since v1:
> > > 
> > >  - the patches that were merged in v1 have been (obviously) dropped from
> > >    this patchset;
> > >  - the patch that was setting the write-combine cache setting has been
> > >    dropped as well, as it was simply not useful.
> > >  - [01/12]:
> > >      * Only remove the outgoing queue, and keep the incoming queue, as we
> > >        want the buffer to start streaming data as soon as it is enabled.
> > >      * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
> > >        same as IIO_BLOCK_STATE_DONE.
> > >  - [02/12]:
> > >      * Fix block->state not being reset in
> > >        iio_dma_buffer_request_update() for output buffers.
> > >      * Only update block->bytes_used once and add a comment about why we
> > >        update it.
> > >      * Add a comment about why we're setting a different state for output
> > >        buffers in iio_dma_buffer_request_update()
> > >      * Remove useless cast to bool (!!) in iio_dma_buffer_io()
> > >  - [05/12]:
> > >      Only allow the new IOCTLs on the buffer FD created with
> > >      IIO_BUFFER_GET_FD_IOCTL().
> > >  - [12/12]:
> > >      * Explicitly state that the new interface is optional and is
> > >        not implemented by all drivers.
> > >      * The IOCTLs can now only be called on the buffer FD returned by
> > >        IIO_BUFFER_GET_FD_IOCTL.
> > >      * Move the page up a bit in the index since it is core stuff and not
> > >        driver-specific.
> > > 
> > >  The patches not listed here have not been modified since v1.
> > > 
> > >  Cheers,
> > >  -Paul
> > > 
> > >  Alexandru Ardelean (1):
> > >    iio: buffer-dma: split iio_dma_buffer_fileio_free() function
> > > 
> > >  Paul Cercueil (11):
> > >    iio: buffer-dma: Get rid of outgoing queue
> > >    iio: buffer-dma: Enable buffer write support
> > >    iio: buffer-dmaengine: Support specifying buffer direction
> > >    iio: buffer-dmaengine: Enable write support
> > >    iio: core: Add new DMABUF interface infrastructure
> > >    iio: buffer-dma: Use DMABUFs instead of custom solution
> > >    iio: buffer-dma: Implement new DMABUF based userspace API
> > >    iio: buffer-dmaengine: Support new DMABUF based userspace API
> > >    iio: core: Add support for cyclic buffers
> > >    iio: buffer-dmaengine: Add support for cyclic buffers
> > >    Documentation: iio: Document high-speed DMABUF based API
> > > 
> > >   Documentation/driver-api/dma-buf.rst          |   2 +
> > >   Documentation/iio/dmabuf_api.rst              |  94 +++
> > >   Documentation/iio/index.rst                   |   2 +
> > >   drivers/iio/adc/adi-axi-adc.c                 |   3 +-
> > >   drivers/iio/buffer/industrialio-buffer-dma.c  | 610
> > > ++++++++++++++----
> > >   .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
> > >   drivers/iio/industrialio-buffer.c             |  60 ++
> > >   include/linux/iio/buffer-dma.h                |  38 +-
> > >   include/linux/iio/buffer-dmaengine.h          |   5 +-
> > >   include/linux/iio/buffer_impl.h               |   8 +
> > >   include/uapi/linux/iio/buffer.h               |  30 +
> > >   11 files changed, 749 insertions(+), 145 deletions(-)
> > >   create mode 100644 Documentation/iio/dmabuf_api.rst
> > > 
> > 
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API
  2022-02-07 13:01   ` [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API Paul Cercueil
@ 2022-03-29  8:54     ` Daniel Vetter
  2022-03-29  9:47       ` Paul Cercueil
  0 siblings, 1 reply; 41+ messages in thread
From: Daniel Vetter @ 2022-03-29  8:54 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Jonathan Cameron, Michael Hennerich, Jonathan Corbet, linux-iio,
	linux-doc, linux-kernel, dri-devel, linaro-mm-sig,
	Alexandru Ardelean, Christian König

On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
> Document the new DMABUF based API.
> 
> v2: - Explicitly state that the new interface is optional and is
>       not implemented by all drivers.
>     - The IOCTLs can now only be called on the buffer FD returned by
>       IIO_BUFFER_GET_FD_IOCTL.
>     - Move the page up a bit in the index since it is core stuff and not
>       driver-specific.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> ---
>  Documentation/driver-api/dma-buf.rst |  2 +
>  Documentation/iio/dmabuf_api.rst     | 94 ++++++++++++++++++++++++++++
>  Documentation/iio/index.rst          |  2 +
>  3 files changed, 98 insertions(+)
>  create mode 100644 Documentation/iio/dmabuf_api.rst
> 
> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> index 2cd7db82d9fe..d3c9b58d2706 100644
> --- a/Documentation/driver-api/dma-buf.rst
> +++ b/Documentation/driver-api/dma-buf.rst
> @@ -1,3 +1,5 @@
> +.. _dma-buf:
> +
>  Buffer Sharing and Synchronization
>  ==================================
>  
> diff --git a/Documentation/iio/dmabuf_api.rst b/Documentation/iio/dmabuf_api.rst
> new file mode 100644
> index 000000000000..43bb2c1b9fdc
> --- /dev/null
> +++ b/Documentation/iio/dmabuf_api.rst
> @@ -0,0 +1,94 @@
> +===================================
> +High-speed DMABUF interface for IIO
> +===================================
> +
> +1. Overview
> +===========
> +
> +The Industrial I/O subsystem supports access to buffers through a file-based
> +interface, with read() and write() access calls through the IIO device's dev
> +node.
> +
> +It additionally supports a DMABUF based interface, where the userspace
> +application can allocate and append DMABUF objects to the buffer's queue.
> +This interface is however optional and is not available in all drivers.
> +
> +The advantage of this DMABUF based interface vs. the read()
> +interface, is that it avoids an extra copy of the data between the
> +kernel and userspace. This is particularly useful for high-speed
> +devices which produce several megabytes or even gigabytes of data per
> +second.
> +
> +The data in this DMABUF interface is managed at the granularity of
> +DMABUF objects. Reducing the granularity from byte level to block level
> +is done to reduce the userspace-kernelspace synchronization overhead
> +since performing syscalls for each byte at a few Mbps is just not
> +feasible.
> +
> +This of course leads to a slightly increased latency. For this reason an
> +application can choose the size of the DMABUFs as well as how many it
> +allocates. E.g. two DMABUFs would be a traditional double buffering
> +scheme. But using a higher number might be necessary to avoid
> +underflow/overflow situations in the presence of scheduling latencies.

So this reads a lot like reinventing io-uring with pre-registered O_DIRECT
memory ranges. Except it's using dma-buf and hand-rolling a lot of pieces
instead of io-uring and O_DIRECT.

At least if the entire justification for dma-buf support is zero-copy
support between the driver and userspace it's _really_ not the right tool
for the job. dma-buf is for zero-copy between devices, with cpu access
from userspace (or kernel fwiw) being very much the exception (and often
flat-out not supported at all).
-Daniel

> +
> +2. User API
> +===========
> +
> +``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *)``
> +----------------------------------------------------------------
> +
> +Each call will allocate a new DMABUF object. The return value (if not
> +a negative errno value as error) will be the file descriptor of the new
> +DMABUF.
> +
> +``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)``
> +--------------------------------------------------------
> +
> +Place the DMABUF object into the queue pending for hardware process.
> +
> +These two IOCTLs have to be performed on the IIO buffer's file
> +descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL` ioctl.
> +
> +3. Usage
> +========
> +
> +To access the data stored in a block by userspace the block must be
> +mapped to the process's memory. This is done by calling mmap() on the
> +DMABUF's file descriptor.
> +
> +Before accessing the data through the map, you must use the
> +DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
> +DMA_BUF_SYNC_START flag, to make sure that the data is available.
> +This call may block until the hardware is done with this block. Once
> +you are done reading or writing the data, you must use this ioctl again
> +with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
> +kernel's queue.
> +
> +If you need to know when the hardware is done with a DMABUF, you can
> +poll its file descriptor for the EPOLLOUT event.
> +
> +Finally, to destroy a DMABUF object, simply call close() on its file
> +descriptor.
> +
> +For more information about manipulating DMABUF objects, see: :ref:`dma-buf`.
> +
> +A typical workflow for the new interface is:
> +
> +    for block in blocks:
> +      DMABUF_ALLOC block
> +      mmap block
> +
> +    enable buffer
> +
> +    while !done
> +      for block in blocks:
> +        DMABUF_ENQUEUE block
> +
> +        DMABUF_SYNC_START block
> +        process data
> +        DMABUF_SYNC_END block
> +
> +    disable buffer
> +
> +    for block in blocks:
> +      close block
> diff --git a/Documentation/iio/index.rst b/Documentation/iio/index.rst
> index 58b7a4ebac51..669deb67ddee 100644
> --- a/Documentation/iio/index.rst
> +++ b/Documentation/iio/index.rst
> @@ -9,4 +9,6 @@ Industrial I/O
>  
>     iio_configfs
>  
> +   dmabuf_api
> +
>     ep93xx_adc
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
  2022-03-29  8:33     ` Daniel Vetter
@ 2022-03-29  9:11       ` Paul Cercueil
  2022-03-29 14:10         ` Daniel Vetter
  0 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-03-29  9:11 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jonathan Cameron, Michael Hennerich, Jonathan Corbet, linux-iio,
	linux-doc, linux-kernel, dri-devel, Sumit Semwal, linaro-mm-sig,
	Alexandru Ardelean, Christian König

Hi Daniel,

On Tue, Mar 29 2022 at 10:33:32 +0200, Daniel Vetter 
<daniel@ffwll.ch> wrote:
> On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
>>  Hi Jonathan,
>> 
>>  On Sun, Feb 13 2022 at 18:46:16 +0000, Jonathan Cameron
>>  <jic23@kernel.org> wrote:
>>  > On Mon,  7 Feb 2022 12:59:21 +0000
>>  > Paul Cercueil <paul@crapouillou.net> wrote:
>>  >
>>  > >  Hi Jonathan,
>>  > >
>>  > >  This is the V2 of my patchset that introduces a new userspace
>>  > > interface
>>  > >  based on DMABUF objects to complement the fileio API, and adds
>>  > > write()
>>  > >  support to the existing fileio API.
>>  >
>>  > Hi Paul,
>>  >
>>  > It's been a little while. Perhaps you could summarize the various 
>> view
>>  > points around the appropriateness of using DMABUF for this?
>>  > I appreciate it is a tricky topic to distil into a brief summary 
>> but
>>  > I know I would find it useful even if no one else does!
>> 
>>  So we want to have a high-speed interface where buffers of samples 
>> are
>>  passed around between IIO devices and other devices (e.g. USB or 
>> network),
>>  or made available to userspace without copying the data.
>> 
>>  DMABUF is, at least in theory, exactly what we need. Quoting the
>>  documentation
>>  (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html):
>>  "The dma-buf subsystem provides the framework for sharing buffers 
>> for
>>  hardware (DMA) access across multiple device drivers and 
>> subsystems, and for
>>  synchronizing asynchronous hardware access. This is used, for 
>> example, by
>>  drm “prime” multi-GPU support, but is of course not limited to 
>> GPU use
>>  cases."
>> 
>>  The problem is that right now DMABUF is only really used by DRM, 
>> and to
>>  quote Daniel, "dma-buf looks like something super generic and 
>> useful, until
>>  you realize that there's a metric ton of gpu/accelerator bagage 
>> piled in".
>> 
>>  Still, it seems to be the only viable option. We could add a custom
>>  buffer-passing interface, but that would mean implementing the same
>>  buffer-passing interface on the network and USB stacks, and before 
>> we know
>>  it we re-invented DMABUFs.
> 
> dma-buf also doesn't support sharing with network and usb stacks, so 
> I'm a
> bit confused why exactly this is useful?

There is an attempt to get dma-buf support in the network stack, called 
"zctap". The last patchset was sent last November. The USB stack does 
not support dma-buf, but we can add it later, I guess.

> So yeah unless there's some sharing going on with gpu stuff (for data
> processing maybe) I'm not sure this makes a lot of sense really. Or at
> least some zero-copy sharing between drivers, but even that would
> minimally require a dma-buf import ioctl of some sorts. Which I either
> missed or doesn't exist.

We do want zero-copy between drivers, the network stack, and the USB 
stack. It's not just about having a userspace interface.

> If there's none of that then just hand-roll your buffer handling code
> (xarray is cheap to use in terms of code for this), you can always add
> dma-buf import/export later on when the need arises.
> 
> Scrolling through patches you only have dma-buf export, but no 
> importing,
> so the use-case that works is with one of the existing subsystems that
> supporting dma-buf importing.
> 
> I think minimally we need the use-case (in form of code) that needs 
> the
> buffer sharing here.

I'll try with zctap and report back.

Cheers,
-Paul

>>  > >
>>  > >  Changes since v1:
>>  > >
>>  > >  - the patches that were merged in v1 have been (obviously) 
>> dropped
>>  > > from
>>  > >    this patchset;
>>  > >  - the patch that was setting the write-combine cache setting 
>> has
>>  > > been
>>  > >    dropped as well, as it was simply not useful.
>>  > >  - [01/12]:
>>  > >      * Only remove the outgoing queue, and keep the incoming 
>> queue,
>>  > > as we
>>  > >        want the buffer to start streaming data as soon as it is
>>  > > enabled.
>>  > >      * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now 
>> functionally
>>  > > the
>>  > >        same as IIO_BLOCK_STATE_DONE.
>>  > >  - [02/12]:
>>  > >      * Fix block->state not being reset in
>>  > >        iio_dma_buffer_request_update() for output buffers.
>>  > >      * Only update block->bytes_used once and add a comment 
>> about
>>  > > why we
>>  > >        update it.
>>  > >      * Add a comment about why we're setting a different state 
>> for
>>  > > output
>>  > >        buffers in iio_dma_buffer_request_update()
>>  > >      * Remove useless cast to bool (!!) in iio_dma_buffer_io()
>>  > >  - [05/12]:
>>  > >      Only allow the new IOCTLs on the buffer FD created with
>>  > >      IIO_BUFFER_GET_FD_IOCTL().
>>  > >  - [12/12]:
>>  > >      * Explicitly state that the new interface is optional and 
>> is
>>  > >        not implemented by all drivers.
>>  > >      * The IOCTLs can now only be called on the buffer FD 
>> returned by
>>  > >        IIO_BUFFER_GET_FD_IOCTL.
>>  > >      * Move the page up a bit in the index since it is core 
>> stuff
>>  > > and not
>>  > >        driver-specific.
>>  > >
>>  > >  The patches not listed here have not been modified since v1.
>>  > >
>>  > >  Cheers,
>>  > >  -Paul
>>  > >
>>  > >  Alexandru Ardelean (1):
>>  > >    iio: buffer-dma: split iio_dma_buffer_fileio_free() function
>>  > >
>>  > >  Paul Cercueil (11):
>>  > >    iio: buffer-dma: Get rid of outgoing queue
>>  > >    iio: buffer-dma: Enable buffer write support
>>  > >    iio: buffer-dmaengine: Support specifying buffer direction
>>  > >    iio: buffer-dmaengine: Enable write support
>>  > >    iio: core: Add new DMABUF interface infrastructure
>>  > >    iio: buffer-dma: Use DMABUFs instead of custom solution
>>  > >    iio: buffer-dma: Implement new DMABUF based userspace API
>>  > >    iio: buffer-dmaengine: Support new DMABUF based userspace API
>>  > >    iio: core: Add support for cyclic buffers
>>  > >    iio: buffer-dmaengine: Add support for cyclic buffers
>>  > >    Documentation: iio: Document high-speed DMABUF based API
>>  > >
>>  > >   Documentation/driver-api/dma-buf.rst          |   2 +
>>  > >   Documentation/iio/dmabuf_api.rst              |  94 +++
>>  > >   Documentation/iio/index.rst                   |   2 +
>>  > >   drivers/iio/adc/adi-axi-adc.c                 |   3 +-
>>  > >   drivers/iio/buffer/industrialio-buffer-dma.c  | 610
>>  > > ++++++++++++++----
>>  > >   .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
>>  > >   drivers/iio/industrialio-buffer.c             |  60 ++
>>  > >   include/linux/iio/buffer-dma.h                |  38 +-
>>  > >   include/linux/iio/buffer-dmaengine.h          |   5 +-
>>  > >   include/linux/iio/buffer_impl.h               |   8 +
>>  > >   include/uapi/linux/iio/buffer.h               |  30 +
>>  > >   11 files changed, 749 insertions(+), 145 deletions(-)
>>  > >   create mode 100644 Documentation/iio/dmabuf_api.rst
>>  > >
>>  >
>> 
>> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API
  2022-03-29  8:54     ` Daniel Vetter
@ 2022-03-29  9:47       ` Paul Cercueil
  2022-03-29 14:07         ` Daniel Vetter
  0 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-03-29  9:47 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jonathan Cameron, Michael Hennerich, Jonathan Corbet, linux-iio,
	linux-doc, linux-kernel, dri-devel, linaro-mm-sig,
	Alexandru Ardelean, Christian König

Hi Daniel,

On Tue, Mar 29 2022 at 10:54:43 +0200, Daniel Vetter 
<daniel@ffwll.ch> wrote:
> On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
>>  Document the new DMABUF based API.
>> 
>>  v2: - Explicitly state that the new interface is optional and is
>>        not implemented by all drivers.
>>      - The IOCTLs can now only be called on the buffer FD returned by
>>        IIO_BUFFER_GET_FD_IOCTL.
>>      - Move the page up a bit in the index since it is core stuff 
>> and not
>>        driver-specific.
>> 
>>  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
>>  ---
>>   Documentation/driver-api/dma-buf.rst |  2 +
>>   Documentation/iio/dmabuf_api.rst     | 94 
>> ++++++++++++++++++++++++++++
>>   Documentation/iio/index.rst          |  2 +
>>   3 files changed, 98 insertions(+)
>>   create mode 100644 Documentation/iio/dmabuf_api.rst
>> 
>>  diff --git a/Documentation/driver-api/dma-buf.rst 
>> b/Documentation/driver-api/dma-buf.rst
>>  index 2cd7db82d9fe..d3c9b58d2706 100644
>>  --- a/Documentation/driver-api/dma-buf.rst
>>  +++ b/Documentation/driver-api/dma-buf.rst
>>  @@ -1,3 +1,5 @@
>>  +.. _dma-buf:
>>  +
>>   Buffer Sharing and Synchronization
>>   ==================================
>> 
>>  diff --git a/Documentation/iio/dmabuf_api.rst 
>> b/Documentation/iio/dmabuf_api.rst
>>  new file mode 100644
>>  index 000000000000..43bb2c1b9fdc
>>  --- /dev/null
>>  +++ b/Documentation/iio/dmabuf_api.rst
>>  @@ -0,0 +1,94 @@
>>  +===================================
>>  +High-speed DMABUF interface for IIO
>>  +===================================
>>  +
>>  +1. Overview
>>  +===========
>>  +
>>  +The Industrial I/O subsystem supports access to buffers through a 
>> file-based
>>  +interface, with read() and write() access calls through the IIO 
>> device's dev
>>  +node.
>>  +
>>  +It additionally supports a DMABUF based interface, where the 
>> userspace
>>  +application can allocate and append DMABUF objects to the buffer's 
>> queue.
>>  +This interface is however optional and is not available in all 
>> drivers.
>>  +
>>  +The advantage of this DMABUF based interface vs. the read()
>>  +interface, is that it avoids an extra copy of the data between the
>>  +kernel and userspace. This is particularly useful for high-speed
>>  +devices which produce several megabytes or even gigabytes of data 
>> per
>>  +second.
>>  +
>>  +The data in this DMABUF interface is managed at the granularity of
>>  +DMABUF objects. Reducing the granularity from byte level to block 
>> level
>>  +is done to reduce the userspace-kernelspace synchronization 
>> overhead
>>  +since performing syscalls for each byte at a few Mbps is just not
>>  +feasible.
>>  +
>>  +This of course leads to a slightly increased latency. For this 
>> reason an
>>  +application can choose the size of the DMABUFs as well as how many 
>> it
>>  +allocates. E.g. two DMABUFs would be a traditional double buffering
>>  +scheme. But using a higher number might be necessary to avoid
>>  +underflow/overflow situations in the presence of scheduling 
>> latencies.
> 
> So this reads a lot like reinventing io-uring with pre-registered 
> O_DIRECT
> memory ranges. Except it's using dma-buf and hand-rolling a lot of 
> pieces
> instead of io-uring and O_DIRECT.

I don't see how io_uring would help us. It's an async I/O framework; 
does it allow us to access a kernel buffer without copying the data? 
Does it allow us to zero-copy the data to a network interface?

> At least if the entire justification for dma-buf support is zero-copy
> support between the driver and userspace it's _really_ not the right 
> tool
> for the job. dma-buf is for zero-copy between devices, with cpu access
> from userpace (or kernel fwiw) being very much the exception (and 
> often
> flat-out not supported at all).

We want both. Using dma-bufs for the driver/userspace interface is a 
convenience, as we then have a single API instead of two distinct ones.

Why should CPU access from userspace be the exception? It works fine 
for IIO dma-bufs. You keep warning about this being a terrible design, 
but I simply don't see it.

Cheers,
-Paul

>>  +
>>  +2. User API
>>  +===========
>>  +
>>  +``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *)``
>>  +----------------------------------------------------------------
>>  +
>>  +Each call will allocate a new DMABUF object. The return value (if 
>> not
>>  +a negative errno value as error) will be the file descriptor of 
>> the new
>>  +DMABUF.
>>  +
>>  +``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)``
>>  +--------------------------------------------------------
>>  +
>>  +Place the DMABUF object into the queue pending for hardware 
>> process.
>>  +
>>  +These two IOCTLs have to be performed on the IIO buffer's file
>>  +descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL` ioctl.
>>  +
>>  +3. Usage
>>  +========
>>  +
>>  +To access the data stored in a block by userspace the block must be
>>  +mapped to the process's memory. This is done by calling mmap() on 
>> the
>>  +DMABUF's file descriptor.
>>  +
>>  +Before accessing the data through the map, you must use the
>>  +DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
>>  +DMA_BUF_SYNC_START flag, to make sure that the data is available.
>>  +This call may block until the hardware is done with this block. 
>> Once
>>  +you are done reading or writing the data, you must use this ioctl 
>> again
>>  +with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
>>  +kernel's queue.
>>  +
>>  +If you need to know when the hardware is done with a DMABUF, you 
>> can
>>  +poll its file descriptor for the EPOLLOUT event.
>>  +
>>  +Finally, to destroy a DMABUF object, simply call close() on its 
>> file
>>  +descriptor.
>>  +
>>  +For more information about manipulating DMABUF objects, see: 
>> :ref:`dma-buf`.
>>  +
>>  +A typical workflow for the new interface is:
>>  +
>>  +    for block in blocks:
>>  +      DMABUF_ALLOC block
>>  +      mmap block
>>  +
>>  +    enable buffer
>>  +
>>  +    while !done
>>  +      for block in blocks:
>>  +        DMABUF_ENQUEUE block
>>  +
>>  +        DMABUF_SYNC_START block
>>  +        process data
>>  +        DMABUF_SYNC_END block
>>  +
>>  +    disable buffer
>>  +
>>  +    for block in blocks:
>>  +      close block
>>  diff --git a/Documentation/iio/index.rst 
>> b/Documentation/iio/index.rst
>>  index 58b7a4ebac51..669deb67ddee 100644
>>  --- a/Documentation/iio/index.rst
>>  +++ b/Documentation/iio/index.rst
>>  @@ -9,4 +9,6 @@ Industrial I/O
>> 
>>      iio_configfs
>> 
>>  +   dmabuf_api
>>  +
>>      ep93xx_adc
>>  --
>>  2.34.1
>> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure
  2022-03-28 18:44     ` Paul Cercueil
@ 2022-03-29 13:36       ` Jonathan Cameron
  0 siblings, 0 replies; 41+ messages in thread
From: Jonathan Cameron @ 2022-03-29 13:36 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Jonathan Cameron, Michael Hennerich, Lars-Peter Clausen,
	Christian König, Sumit Semwal, Jonathan Corbet,
	Alexandru Ardelean, dri-devel, linaro-mm-sig, linux-doc,
	linux-kernel, linux-iio

On Mon, 28 Mar 2022 19:44:19 +0100
Paul Cercueil <paul@crapouillou.net> wrote:

> Hi Jonathan,
> 
> On Mon, Mar 28 2022 at 18:37:01 +0100, Jonathan Cameron 
> <jic23@kernel.org> wrote:
> > On Mon,  7 Feb 2022 12:59:26 +0000
> > Paul Cercueil <paul@crapouillou.net> wrote:
> >   
> >>  Add the necessary infrastructure to the IIO core to support a new
> >>  optional DMABUF based interface.
> >> 
> >>  The advantage of this new DMABUF based interface vs. the read()
> >>  interface, is that it avoids an extra copy of the data between the
> >>  kernel and userspace. This is particularly userful for high-speed  
> > 
> > useful
> >   
> >>  devices which produce several megabytes or even gigabytes of data 
> >> per
> >>  second.
> >> 
> >>  The data in this new DMABUF interface is managed at the granularity 
> >> of
> >>  DMABUF objects. Reducing the granularity from byte level to block 
> >> level
> >>  is done to reduce the userspace-kernelspace synchronization overhead
> >>  since performing syscalls for each byte at a few Mbps is just not
> >>  feasible.
> >> 
> >>  This of course leads to a slightly increased latency. For this 
> >> reason an
> >>  application can choose the size of the DMABUFs as well as how many 
> >> it
> >>  allocates. E.g. two DMABUFs would be a traditional double buffering
> >>  scheme. But using a higher number might be necessary to avoid
> >>  underflow/overflow situations in the presence of scheduling 
> >> latencies.
> >> 
> >>  As part of the interface, 2 new IOCTLs have been added:
> >> 
> >>  IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *):
> >>   Each call will allocate a new DMABUF object. The return value (if 
> >> not
> >>   a negative errno value as error) will be the file descriptor of 
> >> the new
> >>   DMABUF.
> >> 
> >>  IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
> >>   Place the DMABUF object into the queue pending for hardware 
> >> process.
> >> 
> >>  These two IOCTLs have to be performed on the IIO buffer's file
> >>  descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.  
> > 
> > Just to check, do they work on the old deprecated chardev route? 
> > Normally
> > we can directly access the first buffer without the ioctl.  
> 
> They do not. I think it's fine this way, since as you said, the old 
> chardev route is deprecated. But I can add support for it with enough 
> peer pressure.
Agreed. Definitely fine to not support the 'old way'.

J
> 
> >> 
> >>  To access the data stored in a block by userspace the block must be
> >>  mapped to the process's memory. This is done by calling mmap() on 
> >> the
> >>  DMABUF's file descriptor.
> >> 
> >>  Before accessing the data through the map, you must use the
> >>  DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
> >>  DMA_BUF_SYNC_START flag, to make sure that the data is available.
> >>  This call may block until the hardware is done with this block. Once
> >>  you are done reading or writing the data, you must use this ioctl 
> >> again
> >>  with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
> >>  kernel's queue.
> >> 
> >>  If you need to know when the hardware is done with a DMABUF, you can
> >>  poll its file descriptor for the EPOLLOUT event.
> >> 
> >>  Finally, to destroy a DMABUF object, simply call close() on its file
> >>  descriptor.
> >> 
> >>  A typical workflow for the new interface is:
> >> 
> >>    for block in blocks:
> >>      DMABUF_ALLOC block
> >>      mmap block
> >> 
> >>    enable buffer
> >> 
> >>    while !done
> >>      for block in blocks:
> >>        DMABUF_ENQUEUE block
> >> 
> >>        DMABUF_SYNC_START block
> >>        process data
> >>        DMABUF_SYNC_END block
> >> 
> >>    disable buffer
> >> 
> >>    for block in blocks:
> >>      close block  
> > 
> > Given my very limited knowledge of dma-buf, I'll leave commenting
> > on the flow to others who know if this looks 'standards' or not ;)
> > 
> > Code looks sane to me..  
> 
> Thanks.
> 
> Cheers,
> -Paul
> 
> >> 
> >>  v2: Only allow the new IOCTLs on the buffer FD created with
> >>      IIO_BUFFER_GET_FD_IOCTL().
> >> 
> >>  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> >>  ---
> >>   drivers/iio/industrialio-buffer.c | 55 
> >> +++++++++++++++++++++++++++++++
> >>   include/linux/iio/buffer_impl.h   |  8 +++++
> >>   include/uapi/linux/iio/buffer.h   | 29 ++++++++++++++++
> >>   3 files changed, 92 insertions(+)
> >> 
> >>  diff --git a/drivers/iio/industrialio-buffer.c 
> >> b/drivers/iio/industrialio-buffer.c
> >>  index 94eb9f6cf128..72f333a519bc 100644
> >>  --- a/drivers/iio/industrialio-buffer.c
> >>  +++ b/drivers/iio/industrialio-buffer.c
> >>  @@ -17,6 +17,7 @@
> >>   #include <linux/fs.h>
> >>   #include <linux/cdev.h>
> >>   #include <linux/slab.h>
> >>  +#include <linux/mm.h>
> >>   #include <linux/poll.h>
> >>   #include <linux/sched/signal.h>
> >> 
> >>  @@ -1520,11 +1521,65 @@ static int iio_buffer_chrdev_release(struct 
> >> inode *inode, struct file *filep)
> >>   	return 0;
> >>   }
> >> 
> >>  +static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
> >>  +				     struct iio_dmabuf __user *user_buf)
> >>  +{
> >>  +	struct iio_dmabuf dmabuf;
> >>  +
> >>  +	if (!buffer->access->enqueue_dmabuf)
> >>  +		return -EPERM;
> >>  +
> >>  +	if (copy_from_user(&dmabuf, user_buf, sizeof(dmabuf)))
> >>  +		return -EFAULT;
> >>  +
> >>  +	if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
> >>  +		return -EINVAL;
> >>  +
> >>  +	return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
> >>  +}
> >>  +
> >>  +static int iio_buffer_alloc_dmabuf(struct iio_buffer *buffer,
> >>  +				   struct iio_dmabuf_alloc_req __user *user_req)
> >>  +{
> >>  +	struct iio_dmabuf_alloc_req req;
> >>  +
> >>  +	if (!buffer->access->alloc_dmabuf)
> >>  +		return -EPERM;
> >>  +
> >>  +	if (copy_from_user(&req, user_req, sizeof(req)))
> >>  +		return -EFAULT;
> >>  +
> >>  +	if (req.resv)
> >>  +		return -EINVAL;
> >>  +
> >>  +	return buffer->access->alloc_dmabuf(buffer, &req);
> >>  +}
> >>  +
> >>  +static long iio_buffer_chrdev_ioctl(struct file *filp,
> >>  +				    unsigned int cmd, unsigned long arg)
> >>  +{
> >>  +	struct iio_dev_buffer_pair *ib = filp->private_data;
> >>  +	struct iio_buffer *buffer = ib->buffer;
> >>  +	void __user *_arg = (void __user *)arg;
> >>  +
> >>  +	switch (cmd) {
> >>  +	case IIO_BUFFER_DMABUF_ALLOC_IOCTL:
> >>  +		return iio_buffer_alloc_dmabuf(buffer, _arg);
> >>  +	case IIO_BUFFER_DMABUF_ENQUEUE_IOCTL:
> >>  +		/* TODO: support non-blocking enqueue operation */
> >>  +		return iio_buffer_enqueue_dmabuf(buffer, _arg);
> >>  +	default:
> >>  +		return IIO_IOCTL_UNHANDLED;
> >>  +	}
> >>  +}
> >>  +
> >>   static const struct file_operations iio_buffer_chrdev_fileops = {
> >>   	.owner = THIS_MODULE,
> >>   	.llseek = noop_llseek,
> >>   	.read = iio_buffer_read,
> >>   	.write = iio_buffer_write,
> >>  +	.unlocked_ioctl = iio_buffer_chrdev_ioctl,
> >>  +	.compat_ioctl = compat_ptr_ioctl,
> >>   	.poll = iio_buffer_poll,
> >>   	.release = iio_buffer_chrdev_release,
> >>   };
> >>  diff --git a/include/linux/iio/buffer_impl.h 
> >> b/include/linux/iio/buffer_impl.h
> >>  index e2ca8ea23e19..728541bc2c63 100644
> >>  --- a/include/linux/iio/buffer_impl.h
> >>  +++ b/include/linux/iio/buffer_impl.h
> >>  @@ -39,6 +39,9 @@ struct iio_buffer;
> >>    *                      device stops sampling. Calles are balanced 
> >> with @enable.
> >>    * @release:		called when the last reference to the buffer is 
> >> dropped,
> >>    *			should free all resources allocated by the buffer.
> >>  + * @alloc_dmabuf:	called from userspace via ioctl to allocate one 
> >> DMABUF.
> >>  + * @enqueue_dmabuf:	called from userspace via ioctl to queue this 
> >> DMABUF
> >>  + *			object to this buffer. Requires a valid DMABUF fd.
> >>    * @modes:		Supported operating modes by this buffer type
> >>    * @flags:		A bitmask combination of INDIO_BUFFER_FLAG_*
> >>    *
> >>  @@ -68,6 +71,11 @@ struct iio_buffer_access_funcs {
> >> 
> >>   	void (*release)(struct iio_buffer *buffer);
> >> 
> >>  +	int (*alloc_dmabuf)(struct iio_buffer *buffer,
> >>  +			    struct iio_dmabuf_alloc_req *req);
> >>  +	int (*enqueue_dmabuf)(struct iio_buffer *buffer,
> >>  +			      struct iio_dmabuf *block);
> >>  +
> >>   	unsigned int modes;
> >>   	unsigned int flags;
> >>   };
> >>  diff --git a/include/uapi/linux/iio/buffer.h 
> >> b/include/uapi/linux/iio/buffer.h
> >>  index 13939032b3f6..e4621b926262 100644
> >>  --- a/include/uapi/linux/iio/buffer.h
> >>  +++ b/include/uapi/linux/iio/buffer.h
> >>  @@ -5,6 +5,35 @@
> >>   #ifndef _UAPI_IIO_BUFFER_H_
> >>   #define _UAPI_IIO_BUFFER_H_
> >> 
> >>  +#include <linux/types.h>
> >>  +
> >>  +#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000000
> >>  +
> >>  +/**
> >>  + * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO 
> >> DMABUFs
> >>  + * @size:	the size of a single DMABUF
> >>  + * @resv:	reserved
> >>  + */
> >>  +struct iio_dmabuf_alloc_req {
> >>  +	__u64 size;
> >>  +	__u64 resv;
> >>  +};
> >>  +
> >>  +/**
> >>  + * struct iio_dmabuf - Descriptor for a single IIO DMABUF object
> >>  + * @fd:		file descriptor of the DMABUF object
> >>  + * @flags:	one or more IIO_BUFFER_DMABUF_* flags
> >>  + * @bytes_used:	number of bytes used in this DMABUF for the data 
> >> transfer.
> >>  + *		If zero, the full buffer is used.
> >>  + */
> >>  +struct iio_dmabuf {
> >>  +	__u32 fd;
> >>  +	__u32 flags;
> >>  +	__u64 bytes_used;
> >>  +};
> >>  +
> >>   #define IIO_BUFFER_GET_FD_IOCTL			_IOWR('i', 0x91, int)
> >>  +#define IIO_BUFFER_DMABUF_ALLOC_IOCTL		_IOW('i', 0x92, struct 
> >> iio_dmabuf_alloc_req)
> >>  +#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL		_IOW('i', 0x93, struct 
> >> iio_dmabuf)
> >> 
> >>   #endif /* _UAPI_IIO_BUFFER_H_ */  
> >   
> 
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API
  2022-03-29  9:47       ` Paul Cercueil
@ 2022-03-29 14:07         ` Daniel Vetter
  2022-03-29 17:34           ` Paul Cercueil
  0 siblings, 1 reply; 41+ messages in thread
From: Daniel Vetter @ 2022-03-29 14:07 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Daniel Vetter, Jonathan Cameron, Michael Hennerich,
	Jonathan Corbet, linux-iio, linux-doc, linux-kernel, dri-devel,
	linaro-mm-sig, Alexandru Ardelean, Christian König

On Tue, Mar 29, 2022 at 10:47:23AM +0100, Paul Cercueil wrote:
> Hi Daniel,
> 
> On Tue, Mar 29 2022 at 10:54:43 +0200, Daniel Vetter <daniel@ffwll.ch>
> wrote:
> > On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
> > >  Document the new DMABUF based API.
> > > 
> > >  v2: - Explicitly state that the new interface is optional and is
> > >        not implemented by all drivers.
> > >      - The IOCTLs can now only be called on the buffer FD returned by
> > >        IIO_BUFFER_GET_FD_IOCTL.
> > >      - Move the page up a bit in the index since it is core stuff
> > > and not
> > >        driver-specific.
> > > 
> > >  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > >  ---
> > >   Documentation/driver-api/dma-buf.rst |  2 +
> > >   Documentation/iio/dmabuf_api.rst     | 94
> > > ++++++++++++++++++++++++++++
> > >   Documentation/iio/index.rst          |  2 +
> > >   3 files changed, 98 insertions(+)
> > >   create mode 100644 Documentation/iio/dmabuf_api.rst
> > > 
> > >  diff --git a/Documentation/driver-api/dma-buf.rst
> > > b/Documentation/driver-api/dma-buf.rst
> > >  index 2cd7db82d9fe..d3c9b58d2706 100644
> > >  --- a/Documentation/driver-api/dma-buf.rst
> > >  +++ b/Documentation/driver-api/dma-buf.rst
> > >  @@ -1,3 +1,5 @@
> > >  +.. _dma-buf:
> > >  +
> > >   Buffer Sharing and Synchronization
> > >   ==================================
> > > 
> > >  diff --git a/Documentation/iio/dmabuf_api.rst
> > > b/Documentation/iio/dmabuf_api.rst
> > >  new file mode 100644
> > >  index 000000000000..43bb2c1b9fdc
> > >  --- /dev/null
> > >  +++ b/Documentation/iio/dmabuf_api.rst
> > >  @@ -0,0 +1,94 @@
> > >  +===================================
> > >  +High-speed DMABUF interface for IIO
> > >  +===================================
> > >  +
> > >  +1. Overview
> > >  +===========
> > >  +
> > >  +The Industrial I/O subsystem supports access to buffers through a
> > > file-based
> > >  +interface, with read() and write() access calls through the IIO
> > > device's dev
> > >  +node.
> > >  +
> > >  +It additionally supports a DMABUF based interface, where the
> > > userspace
> > >  +application can allocate and append DMABUF objects to the buffer's
> > > queue.
> > >  +This interface is however optional and is not available in all
> > > drivers.
> > >  +
> > >  +The advantage of this DMABUF based interface vs. the read()
> > >  +interface, is that it avoids an extra copy of the data between the
> > >  +kernel and userspace. This is particularly useful for high-speed
> > >  +devices which produce several megabytes or even gigabytes of data
> > > per
> > >  +second.
> > >  +
> > >  +The data in this DMABUF interface is managed at the granularity of
> > >  +DMABUF objects. Reducing the granularity from byte level to block
> > > level
> > >  +is done to reduce the userspace-kernelspace synchronization
> > > overhead
> > >  +since performing syscalls for each byte at a few Mbps is just not
> > >  +feasible.
> > >  +
> > >  +This of course leads to a slightly increased latency. For this
> > > reason an
> > >  +application can choose the size of the DMABUFs as well as how many
> > > it
> > >  +allocates. E.g. two DMABUFs would be a traditional double buffering
> > >  +scheme. But using a higher number might be necessary to avoid
> > >  +underflow/overflow situations in the presence of scheduling
> > > latencies.
> > 
> > So this reads a lot like reinventing io-uring with pre-registered
> > O_DIRECT
> > memory ranges. Except it's using dma-buf and hand-rolling a lot of
> > pieces
> > instead of io-uring and O_DIRECT.
> 
> I don't see how io_uring would help us. It's an async I/O framework, does it
> allow us to access a kernel buffer without copying the data? Does it allow
> us to zero-copy the data to a network interface?

With networking, do you mean rdma, or some other kind of networking?
Anything other than rdma doesn't support dma-buf, and I don't think it will
likely ever do so. Similarly, it's really tricky to glue dma-buf support into
the block layer.

Wrt io_uring, yes it's async, but that's not the point. The point is that
with io_uring you pre-register ranges for reads and writes to target,
which in combination with O_DIRECT, makes it effectively (and efficient!)
zero-copy. Plus it has full integration with both networking and normal
file io, which dma-buf just doesn't have.

Like you _cannot_ do zero copy from a dma-buf into a normal file. You
absolutely can do the same with io_uring.

> > At least if the entire justification for dma-buf support is zero-copy
> > support between the driver and userspace it's _really_ not the right
> > tool
> > for the job. dma-buf is for zero-copy between devices, with cpu access
> > from userspace (or kernel fwiw) being very much the exception (and often
> > flat-out not supported at all).
> 
> We want both. Using dma-bufs for the driver/userspace interface is a
> convenience as we then have a unique API instead of two distinct ones.
> 
> Why should CPU access from userspace be the exception? It works fine for IIO
> dma-bufs. You keep warning about this being a terrible design, but I simply
> don't see it.

It really depends on what you're trying to do, and there's an extremely high
chance it will simply not work.

Unless you want to do zero copy with a gpu, or something which is in that
ecosystem of accelerators and devices, then dma-buf is probably not what
you're looking for.
-Daniel

> 
> Cheers,
> -Paul
> 
> > >  +
> > >  +2. User API
> > >  +===========
> > >  +
> > >  +``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *)``
> > >  +----------------------------------------------------------------
> > >  +
> > >  +Each call will allocate a new DMABUF object. The return value (if
> > > not
> > >  +a negative errno value as error) will be the file descriptor of
> > > the new
> > >  +DMABUF.
> > >  +
> > >  +``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)``
> > >  +--------------------------------------------------------
> > >  +
> > >  +Place the DMABUF object into the queue of buffers pending
> > > hardware processing.
> > >  +
> > >  +These two IOCTLs have to be performed on the IIO buffer's file
> > >  +descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL` ioctl.
> > >  +
> > >  +3. Usage
> > >  +========
> > >  +
> > >  +To access the data stored in a block by userspace the block must be
> > >  +mapped to the process's memory. This is done by calling mmap() on
> > > the
> > >  +DMABUF's file descriptor.
> > >  +
> > >  +Before accessing the data through the map, you must use the
> > >  +DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
> > >  +DMA_BUF_SYNC_START flag, to make sure that the data is available.
> > >  +This call may block until the hardware is done with this block.
> > > Once
> > >  +you are done reading or writing the data, you must use this ioctl
> > > again
> > >  +with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
> > >  +kernel's queue.
> > >  +
> > >  +If you need to know when the hardware is done with a DMABUF, you
> > > can
> > >  +poll its file descriptor for the EPOLLOUT event.
> > >  +
> > >  +Finally, to destroy a DMABUF object, simply call close() on its
> > > file
> > >  +descriptor.
> > >  +
> > >  +For more information about manipulating DMABUF objects, see:
> > > :ref:`dma-buf`.
> > >  +
> > >  +A typical workflow for the new interface is:
> > >  +
> > >  +    for block in blocks:
> > >  +      DMABUF_ALLOC block
> > >  +      mmap block
> > >  +
> > >  +    enable buffer
> > >  +
> > >  +    while !done
> > >  +      for block in blocks:
> > >  +        DMABUF_ENQUEUE block
> > >  +
> > >  +        DMABUF_SYNC_START block
> > >  +        process data
> > >  +        DMABUF_SYNC_END block
> > >  +
> > >  +    disable buffer
> > >  +
> > >  +    for block in blocks:
> > >  +      close block
> > >  diff --git a/Documentation/iio/index.rst
> > > b/Documentation/iio/index.rst
> > >  index 58b7a4ebac51..669deb67ddee 100644
> > >  --- a/Documentation/iio/index.rst
> > >  +++ b/Documentation/iio/index.rst
> > >  @@ -9,4 +9,6 @@ Industrial I/O
> > > 
> > >      iio_configfs
> > > 
> > >  +   dmabuf_api
> > >  +
> > >      ep93xx_adc
> > >  --
> > >  2.34.1
> > > 
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
  2022-03-29  9:11       ` Paul Cercueil
@ 2022-03-29 14:10         ` Daniel Vetter
  2022-03-29 17:16           ` Paul Cercueil
  0 siblings, 1 reply; 41+ messages in thread
From: Daniel Vetter @ 2022-03-29 14:10 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Daniel Vetter, Jonathan Cameron, Michael Hennerich,
	Jonathan Corbet, linux-iio, linux-doc, linux-kernel, dri-devel,
	Sumit Semwal, linaro-mm-sig, Alexandru Ardelean,
	Christian König

On Tue, Mar 29, 2022 at 10:11:14AM +0100, Paul Cercueil wrote:
> Hi Daniel,
> 
> On Tue, Mar 29 2022 at 10:33:32 +0200, Daniel Vetter <daniel@ffwll.ch>
> wrote:
> > On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
> > >  Hi Jonathan,
> > > 
> > >  On Sun, Feb 13 2022 at 18:46:16 +0000, Jonathan Cameron
> > >  <jic23@kernel.org> wrote:
> > >  > On Mon,  7 Feb 2022 12:59:21 +0000
> > >  > Paul Cercueil <paul@crapouillou.net> wrote:
> > >  >
> > >  > >  Hi Jonathan,
> > >  > >
> > >  > >  This is the V2 of my patchset that introduces a new userspace
> > >  > > interface
> > >  > >  based on DMABUF objects to complement the fileio API, and adds
> > >  > > write()
> > >  > >  support to the existing fileio API.
> > >  >
> > >  > Hi Paul,
> > >  >
> > >  > It's been a little while. Perhaps you could summarize the various
> > > view
> > >  > points around the appropriateness of using DMABUF for this?
> > >  > I appreciate it is a tricky topic to distil into a brief summary
> > > but
> > >  > I know I would find it useful even if no one else does!
> > > 
> > >  So we want to have a high-speed interface where buffers of samples
> > > are
> > >  passed around between IIO devices and other devices (e.g. USB or
> > > network),
> > >  or made available to userspace without copying the data.
> > > 
> > >  DMABUF is, at least in theory, exactly what we need. Quoting the
> > >  documentation
> > >  (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html):
> > >  "The dma-buf subsystem provides the framework for sharing buffers
> > > for
> > >  hardware (DMA) access across multiple device drivers and
> > > subsystems, and for
> > >  synchronizing asynchronous hardware access. This is used, for
> > > example, by
> > >  drm “prime” multi-GPU support, but is of course not limited to GPU
> > > use
> > >  cases."
> > > 
> > >  The problem is that right now DMABUF is only really used by DRM,
> > > and to
> > >  quote Daniel, "dma-buf looks like something super generic and
> > > useful, until
> > >  you realize that there's a metric ton of gpu/accelerator baggage
> > > piled in".
> > > 
> > >  Still, it seems to be the only viable option. We could add a custom
> > >  buffer-passing interface, but that would mean implementing the same
> > >  buffer-passing interface on the network and USB stacks, and before
> > > we know
> > >  it we re-invented DMABUFs.
> > 
> > dma-buf also doesn't support sharing with network and usb stacks, so I'm
> > a
> > bit confused why exactly this is useful?
> 
> There is an attempt to get dma-buf support in the network stack, called
> "zctap". Last patchset was sent last november. USB stack does not support
> dma-buf, but we can add it later I guess.
> 
> > So yeah unless there's some sharing going on with gpu stuff (for data
> > processing maybe) I'm not sure this makes a lot of sense really. Or at
> > least some zero-copy sharing between drivers, but even that would
> > minimally require a dma-buf import ioctl of some sorts. Which I either
> > missed or doesn't exist.
> 
> We do want zero-copy between drivers, the network stack, and the USB stack.
> It's not just about having a userspace interface.

I think in that case we need these other pieces too. And we need acks from
relevant subsystems that these other pieces are a) ready for upstream
merging and also that the dma-buf side of things actually makes sense.

> > If there's none of that then just hand-roll your buffer handling code
> > (xarray is cheap to use in terms of code for this), you can always add
> > dma-buf import/export later on when the need arises.
> > 
> > Scrolling through patches you only have dma-buf export, but no
> > importing,
> > so the use-case that works is with one of the existing subsystems that
> > supporting dma-buf importing.
> > 
> > I think minimally we need the use-case (in form of code) that needs the
> > buffer sharing here.
> 
> I'll try with zctap and report back.

Do you have a link for this? I just checked dri-devel on lore, and it's
not there. Nor anywhere else.

We really need all the pieces, and if block layer reaction is anything to
judge by, dma-buf won't happen for networking either. There's some really
nasty and fairly fundamental issues with locking and memory reclaim that
make this utter pain or outright impossible.
-Daniel

> 
> Cheers,
> -Paul
> 
> > >  > >
> > >  > >  Changes since v1:
> > >  > >
> > >  > >  - the patches that were merged in v1 have been (obviously)
> > > dropped
> > >  > > from
> > >  > >    this patchset;
> > >  > >  - the patch that was setting the write-combine cache setting
> > > has
> > >  > > been
> > >  > >    dropped as well, as it was simply not useful.
> > >  > >  - [01/12]:
> > >  > >      * Only remove the outgoing queue, and keep the incoming
> > > queue,
> > >  > > as we
> > >  > >        want the buffer to start streaming data as soon as it is
> > >  > > enabled.
> > >  > >      * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now
> > > functionally
> > >  > > the
> > >  > >        same as IIO_BLOCK_STATE_DONE.
> > >  > >  - [02/12]:
> > >  > >      * Fix block->state not being reset in
> > >  > >        iio_dma_buffer_request_update() for output buffers.
> > >  > >      * Only update block->bytes_used once and add a comment
> > > about
> > >  > > why we
> > >  > >        update it.
> > >  > >      * Add a comment about why we're setting a different state
> > > for
> > >  > > output
> > >  > >        buffers in iio_dma_buffer_request_update()
> > >  > >      * Remove useless cast to bool (!!) in iio_dma_buffer_io()
> > >  > >  - [05/12]:
> > >  > >      Only allow the new IOCTLs on the buffer FD created with
> > >  > >      IIO_BUFFER_GET_FD_IOCTL().
> > >  > >  - [12/12]:
> > >  > >      * Explicitly state that the new interface is optional and
> > > is
> > >  > >        not implemented by all drivers.
> > >  > >      * The IOCTLs can now only be called on the buffer FD
> > > returned by
> > >  > >        IIO_BUFFER_GET_FD_IOCTL.
> > >  > >      * Move the page up a bit in the index since it is core
> > > stuff
> > >  > > and not
> > >  > >        driver-specific.
> > >  > >
> > >  > >  The patches not listed here have not been modified since v1.
> > >  > >
> > >  > >  Cheers,
> > >  > >  -Paul
> > >  > >
> > >  > >  Alexandru Ardelean (1):
> > >  > >    iio: buffer-dma: split iio_dma_buffer_fileio_free() function
> > >  > >
> > >  > >  Paul Cercueil (11):
> > >  > >    iio: buffer-dma: Get rid of outgoing queue
> > >  > >    iio: buffer-dma: Enable buffer write support
> > >  > >    iio: buffer-dmaengine: Support specifying buffer direction
> > >  > >    iio: buffer-dmaengine: Enable write support
> > >  > >    iio: core: Add new DMABUF interface infrastructure
> > >  > >    iio: buffer-dma: Use DMABUFs instead of custom solution
> > >  > >    iio: buffer-dma: Implement new DMABUF based userspace API
> > >  > >    iio: buffer-dmaengine: Support new DMABUF based userspace API
> > >  > >    iio: core: Add support for cyclic buffers
> > >  > >    iio: buffer-dmaengine: Add support for cyclic buffers
> > >  > >    Documentation: iio: Document high-speed DMABUF based API
> > >  > >
> > >  > >   Documentation/driver-api/dma-buf.rst          |   2 +
> > >  > >   Documentation/iio/dmabuf_api.rst              |  94 +++
> > >  > >   Documentation/iio/index.rst                   |   2 +
> > >  > >   drivers/iio/adc/adi-axi-adc.c                 |   3 +-
> > >  > >   drivers/iio/buffer/industrialio-buffer-dma.c  | 610
> > >  > > ++++++++++++++----
> > >  > >   .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
> > >  > >   drivers/iio/industrialio-buffer.c             |  60 ++
> > >  > >   include/linux/iio/buffer-dma.h                |  38 +-
> > >  > >   include/linux/iio/buffer-dmaengine.h          |   5 +-
> > >  > >   include/linux/iio/buffer_impl.h               |   8 +
> > >  > >   include/uapi/linux/iio/buffer.h               |  30 +
> > >  > >   11 files changed, 749 insertions(+), 145 deletions(-)
> > >  > >   create mode 100644 Documentation/iio/dmabuf_api.rst
> > >  > >
> > >  >
> > > 
> > > 
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
  2022-03-29 14:10         ` Daniel Vetter
@ 2022-03-29 17:16           ` Paul Cercueil
  2022-03-30  9:19             ` Daniel Vetter
  0 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-03-29 17:16 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jonathan Cameron, Jonathan Lemon, Michael Hennerich,
	Jonathan Corbet, linux-iio, linux-doc, linux-kernel, dri-devel,
	Sumit Semwal, linaro-mm-sig, Alexandru Ardelean,
	Christian König

Hi Daniel,

On Tue, Mar 29 2022 at 16:10:44 +0200, Daniel Vetter 
<daniel@ffwll.ch> wrote:
> On Tue, Mar 29, 2022 at 10:11:14AM +0100, Paul Cercueil wrote:
>>  Hi Daniel,
>> 
>>  On Tue, Mar 29 2022 at 10:33:32 +0200, Daniel Vetter 
>> <daniel@ffwll.ch>
>>  wrote:
>>  > On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
>>  > >  Hi Jonathan,
>>  > >
>>  > >  On Sun, Feb 13 2022 at 18:46:16 +0000, Jonathan Cameron
>>  > >  <jic23@kernel.org> wrote:
>>  > >  > On Mon,  7 Feb 2022 12:59:21 +0000
>>  > >  > Paul Cercueil <paul@crapouillou.net> wrote:
>>  > >  >
>>  > >  > >  Hi Jonathan,
>>  > >  > >
>>  > >  > >  This is the V2 of my patchset that introduces a new 
>> userspace
>>  > >  > > interface
>>  > >  > >  based on DMABUF objects to complement the fileio API, and 
>> adds
>>  > >  > > write()
>>  > >  > >  support to the existing fileio API.
>>  > >  >
>>  > >  > Hi Paul,
>>  > >  >
>>  > >  > It's been a little while. Perhaps you could summarize the 
>> various
>>  > > view
>>  > >  > points around the appropriateness of using DMABUF for this?
>>  > >  > I appreciate it is a tricky topic to distil into a brief 
>> summary
>>  > > but
>>  > >  > I know I would find it useful even if no one else does!
>>  > >
>>  > >  So we want to have a high-speed interface where buffers of 
>> samples
>>  > > are
>>  > >  passed around between IIO devices and other devices (e.g. USB 
>> or
>>  > > network),
>>  > >  or made available to userspace without copying the data.
>>  > >
>>  > >  DMABUF is, at least in theory, exactly what we need. Quoting 
>> the
>>  > >  documentation
>>  > >  
>> (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html):
>>  > >  "The dma-buf subsystem provides the framework for sharing 
>> buffers
>>  > > for
>>  > >  hardware (DMA) access across multiple device drivers and
>>  > > subsystems, and for
>>  > >  synchronizing asynchronous hardware access. This is used, for
>>  > > example, by
>>  > >  drm “prime” multi-GPU support, but is of course not 
>> limited to GPU
>>  > > use
>>  > >  cases."
>>  > >
>>  > >  The problem is that right now DMABUF is only really used by 
>> DRM,
>>  > > and to
>>  > >  quote Daniel, "dma-buf looks like something super generic and
>>  > > useful, until
>>  > >  you realize that there's a metric ton of gpu/accelerator baggage
>>  > > piled in".
>>  > >
>>  > >  Still, it seems to be the only viable option. We could add a 
>> custom
>>  > >  buffer-passing interface, but that would mean implementing the 
>> same
>>  > >  buffer-passing interface on the network and USB stacks, and 
>> before
>>  > > we know
>>  > >  it we re-invented DMABUFs.
>>  >
>>  > dma-buf also doesn't support sharing with network and usb stacks, 
>> so I'm
>>  > a
>>  > bit confused why exactly this is useful?
>> 
>>  There is an attempt to get dma-buf support in the network stack, 
>> called
>>  "zctap". Last patchset was sent last november. USB stack does not 
>> support
>>  dma-buf, but we can add it later I guess.
>> 
>>  > So yeah unless there's some sharing going on with gpu stuff (for 
>> data
>>  > processing maybe) I'm not sure this makes a lot of sense really. 
>> Or at
>>  > least some zero-copy sharing between drivers, but even that would
>>  > minimally require a dma-buf import ioctl of some sorts. Which I 
>> either
>>  > missed or doesn't exist.
>> 
>>  We do want zero-copy between drivers, the network stack, and the 
>> USB stack.
>>  It's not just about having a userspace interface.
> 
> I think in that case we need these other pieces too. And we need acks 
> from
> relevant subsystems that these other pieces are a) ready for upstream
> merging and also that the dma-buf side of things actually makes sense.

Ok...

>>  > If there's none of that then just hand-roll your buffer handling 
>> code
>>  > (xarray is cheap to use in terms of code for this), you can 
>> always add
>>  > dma-buf import/export later on when the need arises.
>>  >
>>  > Scrolling through patches you only have dma-buf export, but no
>>  > importing,
>>  > so the use-case that works is with one of the existing subsystems 
>> that
>>  > supporting dma-buf importing.
>>  >
>>  > I think minimally we need the use-case (in form of code) that 
>> needs the
>>  > buffer sharing here.
>> 
>>  I'll try with zctap and report back.
> 
> Do you have a link for this? I just checked dri-devel on lore, and 
> it's
> not there. Nor anywhere else.

The code is here: https://github.com/jlemon/zctap_kernel

I know Jonathan Lemon (Cc'd) was working on upstreaming it; I've seen a 
few patchsets.

Cheers,
-Paul

> We really need all the pieces, and if block layer reaction is 
> anything to
> judge by, dma-buf won't happen for networking either. There's some 
> really
> nasty and fairly fundamental issues with locking and memory reclaim 
> that
> make this utter pain or outright impossible.
> -Daniel
> 
>> 
>>  Cheers,
>>  -Paul
>> 
>>  > >  > >
>>  > >  > >  Changes since v1:
>>  > >  > >
>>  > >  > >  - the patches that were merged in v1 have been (obviously)
>>  > > dropped
>>  > >  > > from
>>  > >  > >    this patchset;
>>  > >  > >  - the patch that was setting the write-combine cache 
>> setting
>>  > > has
>>  > >  > > been
>>  > >  > >    dropped as well, as it was simply not useful.
>>  > >  > >  - [01/12]:
>>  > >  > >      * Only remove the outgoing queue, and keep the 
>> incoming
>>  > > queue,
>>  > >  > > as we
>>  > >  > >        want the buffer to start streaming data as soon as 
>> it is
>>  > >  > > enabled.
>>  > >  > >      * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now
>>  > > functionally
>>  > >  > > the
>>  > >  > >        same as IIO_BLOCK_STATE_DONE.
>>  > >  > >  - [02/12]:
>>  > >  > >      * Fix block->state not being reset in
>>  > >  > >        iio_dma_buffer_request_update() for output buffers.
>>  > >  > >      * Only update block->bytes_used once and add a comment
>>  > > about
>>  > >  > > why we
>>  > >  > >        update it.
>>  > >  > >      * Add a comment about why we're setting a different 
>> state
>>  > > for
>>  > >  > > output
>>  > >  > >        buffers in iio_dma_buffer_request_update()
>>  > >  > >      * Remove useless cast to bool (!!) in 
>> iio_dma_buffer_io()
>>  > >  > >  - [05/12]:
>>  > >  > >      Only allow the new IOCTLs on the buffer FD created 
>> with
>>  > >  > >      IIO_BUFFER_GET_FD_IOCTL().
>>  > >  > >  - [12/12]:
>>  > >  > >      * Explicitly state that the new interface is optional 
>> and
>>  > > is
>>  > >  > >        not implemented by all drivers.
>>  > >  > >      * The IOCTLs can now only be called on the buffer FD
>>  > > returned by
>>  > >  > >        IIO_BUFFER_GET_FD_IOCTL.
>>  > >  > >      * Move the page up a bit in the index since it is core
>>  > > stuff
>>  > >  > > and not
>>  > >  > >        driver-specific.
>>  > >  > >
>>  > >  > >  The patches not listed here have not been modified since 
>> v1.
>>  > >  > >
>>  > >  > >  Cheers,
>>  > >  > >  -Paul
>>  > >  > >
>>  > >  > >  Alexandru Ardelean (1):
>>  > >  > >    iio: buffer-dma: split iio_dma_buffer_fileio_free() 
>> function
>>  > >  > >
>>  > >  > >  Paul Cercueil (11):
>>  > >  > >    iio: buffer-dma: Get rid of outgoing queue
>>  > >  > >    iio: buffer-dma: Enable buffer write support
>>  > >  > >    iio: buffer-dmaengine: Support specifying buffer 
>> direction
>>  > >  > >    iio: buffer-dmaengine: Enable write support
>>  > >  > >    iio: core: Add new DMABUF interface infrastructure
>>  > >  > >    iio: buffer-dma: Use DMABUFs instead of custom solution
>>  > >  > >    iio: buffer-dma: Implement new DMABUF based userspace 
>> API
>>  > >  > >    iio: buffer-dmaengine: Support new DMABUF based 
>> userspace API
>>  > >  > >    iio: core: Add support for cyclic buffers
>>  > >  > >    iio: buffer-dmaengine: Add support for cyclic buffers
>>  > >  > >    Documentation: iio: Document high-speed DMABUF based API
>>  > >  > >
>>  > >  > >   Documentation/driver-api/dma-buf.rst          |   2 +
>>  > >  > >   Documentation/iio/dmabuf_api.rst              |  94 +++
>>  > >  > >   Documentation/iio/index.rst                   |   2 +
>>  > >  > >   drivers/iio/adc/adi-axi-adc.c                 |   3 +-
>>  > >  > >   drivers/iio/buffer/industrialio-buffer-dma.c  | 610
>>  > >  > > ++++++++++++++----
>>  > >  > >   .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
>>  > >  > >   drivers/iio/industrialio-buffer.c             |  60 ++
>>  > >  > >   include/linux/iio/buffer-dma.h                |  38 +-
>>  > >  > >   include/linux/iio/buffer-dmaengine.h          |   5 +-
>>  > >  > >   include/linux/iio/buffer_impl.h               |   8 +
>>  > >  > >   include/uapi/linux/iio/buffer.h               |  30 +
>>  > >  > >   11 files changed, 749 insertions(+), 145 deletions(-)
>>  > >  > >   create mode 100644 Documentation/iio/dmabuf_api.rst
>>  > >  > >
>>  > >  >
>>  > >
>>  > >
>>  >
>>  > --
>>  > Daniel Vetter
>>  > Software Engineer, Intel Corporation
>>  > http://blog.ffwll.ch
>> 
>> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch




* Re: [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API
  2022-03-29 14:07         ` Daniel Vetter
@ 2022-03-29 17:34           ` Paul Cercueil
  2022-03-30  9:22             ` Daniel Vetter
  0 siblings, 1 reply; 41+ messages in thread
From: Paul Cercueil @ 2022-03-29 17:34 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jonathan Cameron, Jonathan Lemon, Michael Hennerich,
	Jonathan Corbet, linux-iio, linux-doc, linux-kernel, dri-devel,
	linaro-mm-sig, Alexandru Ardelean, Christian König



On Tue, Mar 29 2022 at 16:07:21 +0200, Daniel Vetter 
<daniel@ffwll.ch> wrote:
> On Tue, Mar 29, 2022 at 10:47:23AM +0100, Paul Cercueil wrote:
>>  Hi Daniel,
>> 
>>  On Tue, Mar 29 2022 at 10:54:43 +0200, Daniel Vetter 
>> <daniel@ffwll.ch>
>>  wrote:
>>  > On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
>>  > >  Document the new DMABUF based API.
>>  > >
>>  > >  v2: - Explicitly state that the new interface is optional and 
>> is
>>  > >        not implemented by all drivers.
>>  > >      - The IOCTLs can now only be called on the buffer FD 
>> returned by
>>  > >        IIO_BUFFER_GET_FD_IOCTL.
>>  > >      - Move the page up a bit in the index since it is core 
>> stuff
>>  > > and not
>>  > >        driver-specific.
>>  > >
>>  > >  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
>>  > >  ---
>>  > >   Documentation/driver-api/dma-buf.rst |  2 +
>>  > >   Documentation/iio/dmabuf_api.rst     | 94
>>  > > ++++++++++++++++++++++++++++
>>  > >   Documentation/iio/index.rst          |  2 +
>>  > >   3 files changed, 98 insertions(+)
>>  > >   create mode 100644 Documentation/iio/dmabuf_api.rst
>>  > >
>>  > >  diff --git a/Documentation/driver-api/dma-buf.rst
>>  > > b/Documentation/driver-api/dma-buf.rst
>>  > >  index 2cd7db82d9fe..d3c9b58d2706 100644
>>  > >  --- a/Documentation/driver-api/dma-buf.rst
>>  > >  +++ b/Documentation/driver-api/dma-buf.rst
>>  > >  @@ -1,3 +1,5 @@
>>  > >  +.. _dma-buf:
>>  > >  +
>>  > >   Buffer Sharing and Synchronization
>>  > >   ==================================
>>  > >
>>  > >  diff --git a/Documentation/iio/dmabuf_api.rst
>>  > > b/Documentation/iio/dmabuf_api.rst
>>  > >  new file mode 100644
>>  > >  index 000000000000..43bb2c1b9fdc
>>  > >  --- /dev/null
>>  > >  +++ b/Documentation/iio/dmabuf_api.rst
>>  > >  @@ -0,0 +1,94 @@
>>  > >  +===================================
>>  > >  +High-speed DMABUF interface for IIO
>>  > >  +===================================
>>  > >  +
>>  > >  +1. Overview
>>  > >  +===========
>>  > >  +
>>  > >  +The Industrial I/O subsystem supports access to buffers 
>> through a
>>  > > file-based
>>  > >  +interface, with read() and write() access calls through the 
>> IIO
>>  > > device's dev
>>  > >  +node.
>>  > >  +
>>  > >  +It additionally supports a DMABUF based interface, where the
>>  > > userspace
>>  > >  +application can allocate and append DMABUF objects to the 
>> buffer's
>>  > > queue.
>>  > >  +This interface is however optional and is not available in all
>>  > > drivers.
>>  > >  +
>>  > >  +The advantage of this DMABUF based interface vs. the read()
>>  > >  +interface, is that it avoids an extra copy of the data 
>> between the
>>  > >  +kernel and userspace. This is particularly useful for 
>> high-speed
>>  > >  +devices which produce several megabytes or even gigabytes of 
>> data
>>  > > per
>>  > >  +second.
>>  > >  +
>>  > >  +The data in this DMABUF interface is managed at the 
>> granularity of
>>  > >  +DMABUF objects. Reducing the granularity from byte level to 
>> block
>>  > > level
>>  > >  +is done to reduce the userspace-kernelspace synchronization
>>  > > overhead
>>  > >  +since performing syscalls for each byte at a few Mbps is just 
>> not
>>  > >  +feasible.
>>  > >  +
>>  > >  +This of course leads to a slightly increased latency. For this
>>  > > reason an
>>  > >  +application can choose the size of the DMABUFs as well as how 
>> many
>>  > > it
>>  > >  +allocates. E.g. two DMABUFs would be a traditional double 
>> buffering
>>  > >  +scheme. But using a higher number might be necessary to avoid
>>  > >  +underflow/overflow situations in the presence of scheduling
>>  > > latencies.
>>  >
>>  > So this reads a lot like reinventing io-uring with pre-registered
>>  > O_DIRECT
>>  > memory ranges. Except it's using dma-buf and hand-rolling a lot of
>>  > pieces
>>  > instead of io-uring and O_DIRECT.
>> 
>>  I don't see how io_uring would help us. It's an async I/O 
>> framework, does it
>>  allow us to access a kernel buffer without copying the data? Does 
>> it allow
>>  us to zero-copy the data to a network interface?
> 
> With networking, do you mean rdma, or some other kind of networking?
> Anything else than rdma doesn't support dma-buf, and I don't think it 
> will
> likely ever do so. Similar it's really tricky to glue dma-buf support 
> into
> the block layer.

By networking I mean standard sockets. If I'm not mistaken, Jonathan 
Lemon's work on zctap was to add dma-buf import/export support to 
standard sockets.

> Wrt io_uring, yes it's async, but that's not the point. The point is 
> that
> with io_uring you pre-register ranges for reads and writes to target,
> which in combination with O_DIRECT, makes it effectively (and 
> efficient!)
> zero-copy. Plus it has full integration with both networking and 
> normal
> file io, which dma-buf just doesn't have.
> 
> Like you _cannot_ do zero copy from a dma-buf into a normal file. You
> absolutely can do the same with io_uring.

I believe io_uring does its zero-copy the same way splice() does, by 
duplicating or moving page references? That wouldn't work with DMA 
coherent memory, which is contiguous and not backed by struct pages.

>>  > At least if the entire justification for dma-buf support is 
>> zero-copy
>>  > support between the driver and userspace it's _really_ not the 
>> right
>>  > tool
>>  > for the job. dma-buf is for zero-copy between devices, with cpu 
>> access
>>  > from userspace (or kernel fwiw) being very much the exception (and 
>> often
>>  > flat-out not supported at all).
>> 
>>  We want both. Using dma-bufs for the driver/userspace interface is a
>>  convenience as we then have a unique API instead of two distinct 
>> ones.
>> 
>>  Why should CPU access from userspace be the exception? It works 
>> fine for IIO
>>  dma-bufs. You keep warning about this being a terrible design, but 
>> I simply
>>  don't see it.
> 
> It depends really on what you're trying to do, and there's extremely 
> high
> chances it will simply not work.

Well it does work, though. The userspace interface is stupidly simple 
here: one dma-buf, backed by DMA coherent memory, is enqueued for 
processing by the DMA. When userspace calls the "sync" ioctl on the 
dma-buf, it blocks until the transfer is complete, and then userspace 
can access the data again.
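
For reference, that "sync" bracketing uses the standard dma-buf UAPI 
from <linux/dma-buf.h>. A minimal sketch of the userspace side (error 
handling trimmed; the dma-buf fd is assumed to come from somewhere 
else, e.g. the proposed IIO alloc ioctl):

```c
/* Sketch: bracketing CPU access to an mmap()ed dma-buf with
 * DMA_BUF_IOCTL_SYNC. The fd is assumed to be a dma-buf file
 * descriptor obtained elsewhere; error handling is trimmed. */
#include <assert.h>
#include <linux/dma-buf.h>
#include <sys/ioctl.h>

/* phase is DMA_BUF_SYNC_START or DMA_BUF_SYNC_END. The START call
 * may block until the hardware is done with the buffer. */
static int dmabuf_sync(int dmabuf_fd, __u64 phase)
{
	struct dma_buf_sync sync = { .flags = phase | DMA_BUF_SYNC_RW };

	return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}
```

Userspace would call dmabuf_sync(fd, DMA_BUF_SYNC_START) before 
touching the mapping and dmabuf_sync(fd, DMA_BUF_SYNC_END) when done, 
before re-enqueueing the block.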


> Unless you want to do zero copy with a gpu, or something which is in 
> that
> ecosystem of accelerators and devices, then dma-buf is probably not 
> what
> you're looking for.
> -Daniel

I want to do zero-copy between an IIO device and the network/USB, and 
right now there is absolutely nothing in place that allows me to do 
that. So I have to get creative.
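
For concreteness, the alloc-and-mmap step of the proposed interface 
would look roughly like this from userspace. The ioctl numbers and 
struct layouts below are my reading of this patch series and should be 
treated as assumptions, not a stable mainline ABI:

```c
/* Sketch of the proposed IIO DMABUF userspace flow (alloc + mmap).
 * All ioctl numbers and struct layouts here are assumptions based
 * on this patch series; they are NOT a merged kernel ABI. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

struct iio_dmabuf_alloc_req {
	uint64_t size;		/* requested block size in bytes */
	uint64_t resv;		/* reserved, must be zero */
};

struct iio_dmabuf {
	uint32_t fd;		/* dma-buf fd to enqueue */
	uint32_t flags;
	uint64_t bytes_used;
};

/* Assumed to continue the 'i' numbering used by
 * IIO_BUFFER_GET_FD_IOCTL. */
#define IIO_BUFFER_DMABUF_ALLOC_IOCTL	_IOW('i', 0x92, struct iio_dmabuf_alloc_req)
#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL	_IOW('i', 0x93, struct iio_dmabuf)

/* Allocate one block on the IIO buffer fd and mmap it; returns the
 * new dma-buf fd, or -1 on error (mmap failure check omitted). */
static int iio_block_alloc(int buffer_fd, size_t size, void **map)
{
	struct iio_dmabuf_alloc_req req = { .size = size };
	int dmabuf_fd = ioctl(buffer_fd, IIO_BUFFER_DMABUF_ALLOC_IOCTL, &req);

	if (dmabuf_fd < 0)
		return -1;
	*map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
		    dmabuf_fd, 0);
	return dmabuf_fd;
}
```

Each block would then be enqueued with the ENQUEUE ioctl and 
synchronized with DMA_BUF_IOCTL_SYNC, as described in the 
documentation patch.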

Cheers,
-Paul

>> 
>>  > >  +
>>  > >  +2. User API
>>  > >  +===========
>>  > >  +
>>  > >  +``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req 
>> *)``
>>  > >  
>> +----------------------------------------------------------------
>>  > >  +
>>  > >  +Each call will allocate a new DMABUF object. The return value 
>> (if
>>  > > not
>>  > >  +a negative errno value as error) will be the file descriptor 
>> of
>>  > > the new
>>  > >  +DMABUF.
>>  > >  +
>>  > >  +``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)``
>>  > >  +--------------------------------------------------------
>>  > >  +
>>  > >  +Place the DMABUF object into the queue pending hardware
>>  > > processing.
>>  > >  +
>>  > >  +These two IOCTLs have to be performed on the IIO buffer's file
>>  > >  +descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL` 
>> ioctl.
>>  > >  +
>>  > >  +3. Usage
>>  > >  +========
>>  > >  +
>>  > >  +To access the data stored in a block by userspace the block 
>> must be
>>  > >  +mapped to the process's memory. This is done by calling 
>> mmap() on
>>  > > the
>>  > >  +DMABUF's file descriptor.
>>  > >  +
>>  > >  +Before accessing the data through the map, you must use the
>>  > >  +DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
>>  > >  +DMA_BUF_SYNC_START flag, to make sure that the data is 
>> available.
>>  > >  +This call may block until the hardware is done with this 
>> block.
>>  > > Once
>>  > >  +you are done reading or writing the data, you must use this 
>> ioctl
>>  > > again
>>  > >  +with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF 
>> to the
>>  > >  +kernel's queue.
>>  > >  +
>>  > >  +If you need to know when the hardware is done with a DMABUF, 
>> you
>>  > > can
>>  > >  +poll its file descriptor for the EPOLLOUT event.
>>  > >  +
>>  > >  +Finally, to destroy a DMABUF object, simply call close() on 
>> its
>>  > > file
>>  > >  +descriptor.
>>  > >  +
>>  > >  +For more information about manipulating DMABUF objects, see:
>>  > > :ref:`dma-buf`.
>>  > >  +
>>  > >  +A typical workflow for the new interface is:
>>  > >  +
>>  > >  +    for block in blocks:
>>  > >  +      DMABUF_ALLOC block
>>  > >  +      mmap block
>>  > >  +
>>  > >  +    enable buffer
>>  > >  +
>>  > >  +    while !done
>>  > >  +      for block in blocks:
>>  > >  +        DMABUF_ENQUEUE block
>>  > >  +
>>  > >  +        DMABUF_SYNC_START block
>>  > >  +        process data
>>  > >  +        DMABUF_SYNC_END block
>>  > >  +
>>  > >  +    disable buffer
>>  > >  +
>>  > >  +    for block in blocks:
>>  > >  +      close block
>>  > >  diff --git a/Documentation/iio/index.rst
>>  > > b/Documentation/iio/index.rst
>>  > >  index 58b7a4ebac51..669deb67ddee 100644
>>  > >  --- a/Documentation/iio/index.rst
>>  > >  +++ b/Documentation/iio/index.rst
>>  > >  @@ -9,4 +9,6 @@ Industrial I/O
>>  > >
>>  > >      iio_configfs
>>  > >
>>  > >  +   dmabuf_api
>>  > >  +
>>  > >      ep93xx_adc
>>  > >  --
>>  > >  2.34.1
>>  > >
>>  >
>>  > --
>>  > Daniel Vetter
>>  > Software Engineer, Intel Corporation
>>  > http://blog.ffwll.ch
>> 
>> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch




* Re: [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
  2022-03-29 17:16           ` Paul Cercueil
@ 2022-03-30  9:19             ` Daniel Vetter
  0 siblings, 0 replies; 41+ messages in thread
From: Daniel Vetter @ 2022-03-30  9:19 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Daniel Vetter, Jonathan Cameron, Jonathan Lemon,
	Michael Hennerich, Jonathan Corbet, linux-iio, linux-doc,
	linux-kernel, dri-devel, Sumit Semwal, linaro-mm-sig,
	Alexandru Ardelean, Christian König

On Tue, Mar 29, 2022 at 06:16:56PM +0100, Paul Cercueil wrote:
> Hi Daniel,
> 
> Le mar., mars 29 2022 at 16:10:44 +0200, Daniel Vetter <daniel@ffwll.ch> a
> écrit :
> > On Tue, Mar 29, 2022 at 10:11:14AM +0100, Paul Cercueil wrote:
> > >  Hi Daniel,
> > > 
> > >  Le mar., mars 29 2022 at 10:33:32 +0200, Daniel Vetter
> > > <daniel@ffwll.ch> a
> > >  écrit :
> > >  > On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
> > >  > >  Hi Jonathan,
> > >  > >
> > >  > >  Le dim., févr. 13 2022 at 18:46:16 +0000, Jonathan Cameron
> > >  > >  <jic23@kernel.org> a écrit :
> > >  > >  > On Mon,  7 Feb 2022 12:59:21 +0000
> > >  > >  > Paul Cercueil <paul@crapouillou.net> wrote:
> > >  > >  >
> > >  > >  > >  Hi Jonathan,
> > >  > >  > >
> > >  > >  > >  This is the V2 of my patchset that introduces a new
> > > userspace
> > >  > >  > > interface
> > >  > >  > >  based on DMABUF objects to complement the fileio API, and
> > > adds
> > >  > >  > > write()
> > >  > >  > >  support to the existing fileio API.
> > >  > >  >
> > >  > >  > Hi Paul,
> > >  > >  >
> > >  > >  > It's been a little while. Perhaps you could summarize the
> > > various
> > >  > > view
> > >  > >  > points around the appropriateness of using DMABUF for this?
> > >  > >  > I appreciate it is a tricky topic to distil into a brief
> > > summary
> > >  > > but
> > >  > >  > I know I would find it useful even if no one else does!
> > >  > >
> > >  > >  So we want to have a high-speed interface where buffers of
> > > samples
> > >  > > are
> > >  > >  passed around between IIO devices and other devices (e.g. USB
> > > or
> > >  > > network),
> > >  > >  or made available to userspace without copying the data.
> > >  > >
> > >  > >  DMABUF is, at least in theory, exactly what we need. Quoting
> > > the
> > >  > >  documentation
> > >  > >
> > > (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html):
> > >  > >  "The dma-buf subsystem provides the framework for sharing
> > > buffers
> > >  > > for
> > >  > >  hardware (DMA) access across multiple device drivers and
> > >  > > subsystems, and for
> > >  > >  synchronizing asynchronous hardware access. This is used, for
> > >  > > example, by
> > >  > >  drm “prime” multi-GPU support, but is of course not limited to
> > > GPU
> > >  > > use
> > >  > >  cases."
> > >  > >
> > >  > >  The problem is that right now DMABUF is only really used by
> > > DRM,
> > >  > > and to
> > >  > >  quote Daniel, "dma-buf looks like something super generic and
> > >  > > useful, until
> > >  > >  you realize that there's a metric ton of gpu/accelerator bagage
> > >  > > piled in".
> > >  > >
> > >  > >  Still, it seems to be the only viable option. We could add a
> > > custom
> > >  > >  buffer-passing interface, but that would mean implementing the
> > > same
> > >  > >  buffer-passing interface on the network and USB stacks, and
> > > before
> > >  > > we know
> > >  > >  it we re-invented DMABUFs.
> > >  >
> > >  > dma-buf also doesn't support sharing with network and usb stacks,
> > > so I'm
> > >  > a
> > >  > bit confused why exactly this is useful?
> > > 
> > >  There is an attempt to get dma-buf support in the network stack,
> > > called
> > >  "zctap". Last patchset was sent last november. USB stack does not
> > > support
> > >  dma-buf, but we can add it later I guess.
> > > 
> > >  > So yeah unless there's some sharing going on with gpu stuff (for
> > > data
> > >  > processing maybe) I'm not sure this makes a lot of sense really.
> > > Or at
> > >  > least some zero-copy sharing between drivers, but even that would
> > >  > minimally require a dma-buf import ioctl of some sorts. Which I
> > > either
> > >  > missed or doesn't exist.
> > > 
> > >  We do want zero-copy between drivers, the network stack, and the
> > > USB stack.
> > >  It's not just about having a userspace interface.
> > 
> > I think in that case we need these other pieces too. And we need acks
> > from
> > relevant subsystems that these other pieces are a) ready for upstream
> > merging and also that the dma-buf side of things actually makes sense.
> 
> Ok...
> 
> > >  > If there's none of that then just hand-roll your buffer handling
> > > code
> > >  > (xarray is cheap to use in terms of code for this), you can
> > > always add
> > >  > dma-buf import/export later on when the need arises.
> > >  >
> > >  > Scrolling through patches you only have dma-buf export, but no
> > >  > importing,
> > >  > so the use-case that works is with one of the existing subsystems
> > > that
> > >  > supporting dma-buf importing.
> > >  >
> > >  > I think minimally we need the use-case (in form of code) that
> > > needs the
> > >  > buffer sharing here.
> > > 
> > >  I'll try with zctap and report back.
> > 
> > Do you have a link for this? I just checked dri-devel on lore, and it's
> > not there. Nor anywhere else.
> 
> The code is here: https://github.com/jlemon/zctap_kernel
> 
> I know Jonathan Lemon (Cc'd) was working on upstreaming it, I saw a few
> patchsets.

Yeah if the goal here is to zero-copy from iio to network sockets, then I
think we really need the full picture first, at least as a prototype.

And also a rough consensus among all involved subsystems that this is the
right approach and that there are no fundamental issues. I really have no
clue about networking, so I can't make a call there.

I'm bringing this up because a few folks wanted to look into zero-copy
between gpu and nvme, using dma-buf. And after lots of
head-banging-against-solid-concrete-walls, at least my conclusion is that
due to locking issues it's really not possible without huge changes to the
block I/O layer. And those are not on the table.
-Daniel

> 
> Cheers,
> -Paul
> 
> > We really need all the pieces, and if block layer reaction is anything
> > to
> > judge by, dma-buf wont happen for networking either. There's some really
> > nasty and fairly fundamental issues with locking and memory reclaim that
> > make this utter pain or outright impossible.
> > -Daniel
> > 
> > > 
> > >  Cheers,
> > >  -Paul
> > > 
> > >  > >  > >
> > >  > >  > >  Changes since v1:
> > >  > >  > >
> > >  > >  > >  - the patches that were merged in v1 have been (obviously)
> > >  > > dropped
> > >  > >  > > from
> > >  > >  > >    this patchset;
> > >  > >  > >  - the patch that was setting the write-combine cache
> > > setting
> > >  > > has
> > >  > >  > > been
> > >  > >  > >    dropped as well, as it was simply not useful.
> > >  > >  > >  - [01/12]:
> > >  > >  > >      * Only remove the outgoing queue, and keep the
> > > incoming
> > >  > > queue,
> > >  > >  > > as we
> > >  > >  > >        want the buffer to start streaming data as soon as
> > > it is
> > >  > >  > > enabled.
> > >  > >  > >      * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now
> > >  > > functionally
> > >  > >  > > the
> > >  > >  > >        same as IIO_BLOCK_STATE_DONE.
> > >  > >  > >  - [02/12]:
> > >  > >  > >      * Fix block->state not being reset in
> > >  > >  > >        iio_dma_buffer_request_update() for output buffers.
> > >  > >  > >      * Only update block->bytes_used once and add a comment
> > >  > > about
> > >  > >  > > why we
> > >  > >  > >        update it.
> > >  > >  > >      * Add a comment about why we're setting a different
> > > state
> > >  > > for
> > >  > >  > > output
> > >  > >  > >        buffers in iio_dma_buffer_request_update()
> > >  > >  > >      * Remove useless cast to bool (!!) in
> > > iio_dma_buffer_io()
> > >  > >  > >  - [05/12]:
> > >  > >  > >      Only allow the new IOCTLs on the buffer FD created
> > > with
> > >  > >  > >      IIO_BUFFER_GET_FD_IOCTL().
> > >  > >  > >  - [12/12]:
> > >  > >  > >      * Explicitly state that the new interface is optional
> > > and
> > >  > > is
> > >  > >  > >        not implemented by all drivers.
> > >  > >  > >      * The IOCTLs can now only be called on the buffer FD
> > >  > > returned by
> > >  > >  > >        IIO_BUFFER_GET_FD_IOCTL.
> > >  > >  > >      * Move the page up a bit in the index since it is core
> > >  > > stuff
> > >  > >  > > and not
> > >  > >  > >        driver-specific.
> > >  > >  > >
> > >  > >  > >  The patches not listed here have not been modified since
> > > v1.
> > >  > >  > >
> > >  > >  > >  Cheers,
> > >  > >  > >  -Paul
> > >  > >  > >
> > >  > >  > >  Alexandru Ardelean (1):
> > >  > >  > >    iio: buffer-dma: split iio_dma_buffer_fileio_free()
> > > function
> > >  > >  > >
> > >  > >  > >  Paul Cercueil (11):
> > >  > >  > >    iio: buffer-dma: Get rid of outgoing queue
> > >  > >  > >    iio: buffer-dma: Enable buffer write support
> > >  > >  > >    iio: buffer-dmaengine: Support specifying buffer
> > > direction
> > >  > >  > >    iio: buffer-dmaengine: Enable write support
> > >  > >  > >    iio: core: Add new DMABUF interface infrastructure
> > >  > >  > >    iio: buffer-dma: Use DMABUFs instead of custom solution
> > >  > >  > >    iio: buffer-dma: Implement new DMABUF based userspace
> > > API
> > >  > >  > >    iio: buffer-dmaengine: Support new DMABUF based
> > > userspace API
> > >  > >  > >    iio: core: Add support for cyclic buffers
> > >  > >  > >    iio: buffer-dmaengine: Add support for cyclic buffers
> > >  > >  > >    Documentation: iio: Document high-speed DMABUF based API
> > >  > >  > >
> > >  > >  > >   Documentation/driver-api/dma-buf.rst          |   2 +
> > >  > >  > >   Documentation/iio/dmabuf_api.rst              |  94 +++
> > >  > >  > >   Documentation/iio/index.rst                   |   2 +
> > >  > >  > >   drivers/iio/adc/adi-axi-adc.c                 |   3 +-
> > >  > >  > >   drivers/iio/buffer/industrialio-buffer-dma.c  | 610
> > >  > >  > > ++++++++++++++----
> > >  > >  > >   .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
> > >  > >  > >   drivers/iio/industrialio-buffer.c             |  60 ++
> > >  > >  > >   include/linux/iio/buffer-dma.h                |  38 +-
> > >  > >  > >   include/linux/iio/buffer-dmaengine.h          |   5 +-
> > >  > >  > >   include/linux/iio/buffer_impl.h               |   8 +
> > >  > >  > >   include/uapi/linux/iio/buffer.h               |  30 +
> > >  > >  > >   11 files changed, 749 insertions(+), 145 deletions(-)
> > >  > >  > >   create mode 100644 Documentation/iio/dmabuf_api.rst
> > >  > >  > >
> > >  > >  >
> > >  > >
> > >  > >
> > >  >
> > >  > --
> > >  > Daniel Vetter
> > >  > Software Engineer, Intel Corporation
> > >  > http://blog.ffwll.ch
> > > 
> > > 
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API
  2022-03-29 17:34           ` Paul Cercueil
@ 2022-03-30  9:22             ` Daniel Vetter
  0 siblings, 0 replies; 41+ messages in thread
From: Daniel Vetter @ 2022-03-30  9:22 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: Daniel Vetter, Jonathan Cameron, Jonathan Lemon,
	Michael Hennerich, Jonathan Corbet, linux-iio, linux-doc,
	linux-kernel, dri-devel, linaro-mm-sig, Alexandru Ardelean,
	Christian König

On Tue, Mar 29, 2022 at 06:34:58PM +0100, Paul Cercueil wrote:
> 
> 
> Le mar., mars 29 2022 at 16:07:21 +0200, Daniel Vetter <daniel@ffwll.ch> a
> écrit :
> > On Tue, Mar 29, 2022 at 10:47:23AM +0100, Paul Cercueil wrote:
> > >  Hi Daniel,
> > > 
> > >  Le mar., mars 29 2022 at 10:54:43 +0200, Daniel Vetter
> > > <daniel@ffwll.ch> a
> > >  écrit :
> > >  > On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
> > >  > >  Document the new DMABUF based API.
> > >  > >
> > >  > >  v2: - Explicitly state that the new interface is optional and
> > > is
> > >  > >        not implemented by all drivers.
> > >  > >      - The IOCTLs can now only be called on the buffer FD
> > > returned by
> > >  > >        IIO_BUFFER_GET_FD_IOCTL.
> > >  > >      - Move the page up a bit in the index since it is core
> > > stuff
> > >  > > and not
> > >  > >        driver-specific.
> > >  > >
> > >  > >  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > >  > >  ---
> > >  > >   Documentation/driver-api/dma-buf.rst |  2 +
> > >  > >   Documentation/iio/dmabuf_api.rst     | 94
> > >  > > ++++++++++++++++++++++++++++
> > >  > >   Documentation/iio/index.rst          |  2 +
> > >  > >   3 files changed, 98 insertions(+)
> > >  > >   create mode 100644 Documentation/iio/dmabuf_api.rst
> > >  > >
> > >  > >  diff --git a/Documentation/driver-api/dma-buf.rst
> > >  > > b/Documentation/driver-api/dma-buf.rst
> > >  > >  index 2cd7db82d9fe..d3c9b58d2706 100644
> > >  > >  --- a/Documentation/driver-api/dma-buf.rst
> > >  > >  +++ b/Documentation/driver-api/dma-buf.rst
> > >  > >  @@ -1,3 +1,5 @@
> > >  > >  +.. _dma-buf:
> > >  > >  +
> > >  > >   Buffer Sharing and Synchronization
> > >  > >   ==================================
> > >  > >
> > >  > >  diff --git a/Documentation/iio/dmabuf_api.rst
> > >  > > b/Documentation/iio/dmabuf_api.rst
> > >  > >  new file mode 100644
> > >  > >  index 000000000000..43bb2c1b9fdc
> > >  > >  --- /dev/null
> > >  > >  +++ b/Documentation/iio/dmabuf_api.rst
> > >  > >  @@ -0,0 +1,94 @@
> > >  > >  +===================================
> > >  > >  +High-speed DMABUF interface for IIO
> > >  > >  +===================================
> > >  > >  +
> > >  > >  +1. Overview
> > >  > >  +===========
> > >  > >  +
> > >  > >  +The Industrial I/O subsystem supports access to buffers
> > > through a
> > >  > > file-based
> > >  > >  +interface, with read() and write() access calls through the
> > > IIO
> > >  > > device's dev
> > >  > >  +node.
> > >  > >  +
> > >  > >  +It additionally supports a DMABUF based interface, where the
> > >  > > userspace
> > >  > >  +application can allocate and append DMABUF objects to the
> > > buffer's
> > >  > > queue.
> > >  > >  +This interface is however optional and is not available in all
> > >  > > drivers.
> > >  > >  +
> > >  > >  +The advantage of this DMABUF based interface vs. the read()
> > >  > >  +interface, is that it avoids an extra copy of the data
> > > between the
> > >  > >  +kernel and userspace. This is particularly useful for
> > > high-speed
> > >  > >  +devices which produce several megabytes or even gigabytes of
> > > data
> > >  > > per
> > >  > >  +second.
> > >  > >  +
> > >  > >  +The data in this DMABUF interface is managed at the
> > > granularity of
> > >  > >  +DMABUF objects. Reducing the granularity from byte level to
> > > block
> > >  > > level
> > >  > >  +is done to reduce the userspace-kernelspace synchronization
> > >  > > overhead
> > >  > >  +since performing syscalls for each byte at a few Mbps is just
> > > not
> > >  > >  +feasible.
> > >  > >  +
> > >  > >  +This of course leads to a slightly increased latency. For this
> > >  > > reason an
> > >  > >  +application can choose the size of the DMABUFs as well as how
> > > many
> > >  > > it
> > >  > >  +allocates. E.g. two DMABUFs would be a traditional double
> > > buffering
> > >  > >  +scheme. But using a higher number might be necessary to avoid
> > >  > >  +underflow/overflow situations in the presence of scheduling
> > >  > > latencies.
> > >  >
> > >  > So this reads a lot like reinventing io-uring with pre-registered
> > >  > O_DIRECT
> > >  > memory ranges. Except it's using dma-buf and hand-rolling a lot of
> > >  > pieces
> > >  > instead of io-uring and O_DIRECT.
> > > 
> > >  I don't see how io_uring would help us. It's an async I/O
> > > framework, does it
> > >  allow us to access a kernel buffer without copying the data? Does
> > > it allow
> > >  us to zero-copy the data to a network interface?
> > 
> > With networking, do you mean rdma, or some other kind of networking?
> > Anything else than rdma doesn't support dma-buf, and I don't think it
> > will
> > likely ever do so. Similar it's really tricky to glue dma-buf support
> > into
> > the block layer.
> 
> By networking I mean standard sockets. If I'm not mistaken, Jonathan Lemon's
> work on zctap was to add dma-buf import/export support to standard sockets.
> 
> > Wrt io_uring, yes it's async, but that's not the point. The point is
> > that
> > with io_uring you pre-register ranges for reads and writes to target,
> > which in combination with O_DIRECT, makes it effectively (and
> > efficient!)
> > zero-copy. Plus it has full integration with both networking and normal
> > file io, which dma-buf just doesn't have.
> > 
> > Like you _cannot_ do zero copy from a dma-buf into a normal file. You
> > absolutely can do the same with io_uring.
> 
> I believe io_uring does zero-copy the same way as splice(), by
> duplicating/moving pages? Because that wouldn't work with DMA coherent
> memory, which is contiguous and not backed by pages.

Yeah if your memory has to be contig and/or write-combined/uncached for
dma reasons, then we're much more firmly into dma-buf territory. But also
that means we really need dma-buf support in the networking stack, and
that might be a supreme challenge.

E.g. dma-buf is all about pre-registering memory (dma_buf_attach is
potentially very expensive) for a specific device. With fully general
networking, none of this is possible, since until you make the dynamic
decision to send stuff out, you might not even know which device the
packets will go out on.

Also, with filtering and everything, CPU access is pretty much assumed, and
doing that with dma-buf is a bit of a challenge.
-Daniel

> 
> > >  > At least if the entire justification for dma-buf support is
> > > zero-copy
> > >  > support between the driver and userspace it's _really_ not the
> > > right
> > >  > tool
> > >  > for the job. dma-buf is for zero-copy between devices, with cpu
> > > access
> > >  > from userspace (or kernel fwiw) being very much the exception (and
> > > often
> > >  > flat-out not supported at all).
> > > 
> > >  We want both. Using dma-bufs for the driver/userspace interface is a
> > >  convenience as we then have a unique API instead of two distinct
> > > ones.
> > > 
> > >  Why should CPU access from userspace be the exception? It works
> > > fine for IIO
> > >  dma-bufs. You keep warning about this being a terrible design, but
> > > I simply
> > >  don't see it.
> > 
> > It depends really on what you're trying to do, and there's extremely
> > high
> > chances it will simply not work.
> 
> Well it does work though. The userspace interface is stupidly simple here -
> one dma-buf, backed by DMA coherent memory, is enqueued for processing by
> the DMA. The userspace calling the "sync" ioctl on the dma-buf will block
> until the transfer is complete, and then userspace can access it again.
> 
> 
> > Unless you want to do zero copy with a gpu, or something which is in
> > that
> > ecosystem of accelerators and devices, then dma-buf is probably not what
> > you're looking for.
> > -Daniel
> 
> I want to do zero-copy between a IIO device and the network/USB, and right
> now there is absolutely nothing in place that allows me to do that. So I
> have to get creative.
> 
> Cheers,
> -Paul
> 
> > > 
> > >  > >  +
> > >  > >  +2. User API
> > >  > >  +===========
> > >  > >  +
> > >  > >  +``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req
> > > *)``
> > >  > >
> > > +----------------------------------------------------------------
> > >  > >  +
> > >  > >  +Each call will allocate a new DMABUF object. The return value
> > > (if
> > >  > > not
> > >  > >  +a negative errno value as error) will be the file descriptor
> > > of
> > >  > > the new
> > >  > >  +DMABUF.
> > >  > >  +
> > >  > >  +``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)``
> > >  > >  +--------------------------------------------------------
> > >  > >  +
> > >  > >  +Place the DMABUF object into the queue pending hardware
> > >  > > processing.
> > >  > >  +
> > >  > >  +These two IOCTLs have to be performed on the IIO buffer's file
> > >  > >  +descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL`
> > > ioctl.
> > >  > >  +
> > >  > >  +3. Usage
> > >  > >  +========
> > >  > >  +
> > >  > >  +To access the data stored in a block by userspace the block
> > > must be
> > >  > >  +mapped to the process's memory. This is done by calling
> > > mmap() on
> > >  > > the
> > >  > >  +DMABUF's file descriptor.
> > >  > >  +
> > >  > >  +Before accessing the data through the map, you must use the
> > >  > >  +DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
> > >  > >  +DMA_BUF_SYNC_START flag, to make sure that the data is
> > > available.
> > >  > >  +This call may block until the hardware is done with this
> > > block.
> > >  > > Once
> > >  > >  +you are done reading or writing the data, you must use this
> > > ioctl
> > >  > > again
> > >  > >  +with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF
> > > to the
> > >  > >  +kernel's queue.
> > >  > >  +
> > >  > >  +If you need to know when the hardware is done with a DMABUF,
> > > you
> > >  > > can
> > >  > >  +poll its file descriptor for the EPOLLOUT event.
> > >  > >  +
> > >  > >  +Finally, to destroy a DMABUF object, simply call close() on
> > > its
> > >  > > file
> > >  > >  +descriptor.
> > >  > >  +
> > >  > >  +For more information about manipulating DMABUF objects, see:
> > >  > > :ref:`dma-buf`.
> > >  > >  +
> > >  > >  +A typical workflow for the new interface is:
> > >  > >  +
> > >  > >  +    for block in blocks:
> > >  > >  +      DMABUF_ALLOC block
> > >  > >  +      mmap block
> > >  > >  +
> > >  > >  +    enable buffer
> > >  > >  +
> > >  > >  +    while !done
> > >  > >  +      for block in blocks:
> > >  > >  +        DMABUF_ENQUEUE block
> > >  > >  +
> > >  > >  +        DMABUF_SYNC_START block
> > >  > >  +        process data
> > >  > >  +        DMABUF_SYNC_END block
> > >  > >  +
> > >  > >  +    disable buffer
> > >  > >  +
> > >  > >  +    for block in blocks:
> > >  > >  +      close block
> > >  > >  diff --git a/Documentation/iio/index.rst
> > >  > > b/Documentation/iio/index.rst
> > >  > >  index 58b7a4ebac51..669deb67ddee 100644
> > >  > >  --- a/Documentation/iio/index.rst
> > >  > >  +++ b/Documentation/iio/index.rst
> > >  > >  @@ -9,4 +9,6 @@ Industrial I/O
> > >  > >
> > >  > >      iio_configfs
> > >  > >
> > >  > >  +   dmabuf_api
> > >  > >  +
> > >  > >      ep93xx_adc
> > >  > >  --
> > >  > >  2.34.1
> > >  > >
> > >  >
> > >  > --
> > >  > Daniel Vetter
> > >  > Software Engineer, Intel Corporation
> > >  > http://blog.ffwll.ch
> > > 
> > > 
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Thread overview: 41+ messages
2022-02-07 12:59 [PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API Paul Cercueil
2022-02-07 12:59 ` [PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue Paul Cercueil
2022-02-13 18:57   ` Jonathan Cameron
2022-02-13 19:25     ` Paul Cercueil
2022-03-28 17:17   ` Jonathan Cameron
2022-02-07 12:59 ` [PATCH v2 02/12] iio: buffer-dma: Enable buffer write support Paul Cercueil
2022-03-28 17:24   ` Jonathan Cameron
2022-03-28 18:39     ` Paul Cercueil
2022-03-29  7:11       ` Nuno Sá
2022-03-28 20:38   ` Andy Shevchenko
2022-02-07 12:59 ` [PATCH v2 03/12] iio: buffer-dmaengine: Support specifying buffer direction Paul Cercueil
2022-02-07 12:59 ` [PATCH v2 04/12] iio: buffer-dmaengine: Enable write support Paul Cercueil
2022-03-28 17:28   ` Jonathan Cameron
2022-02-07 12:59 ` [PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure Paul Cercueil
2022-03-28 17:37   ` Jonathan Cameron
2022-03-28 18:44     ` Paul Cercueil
2022-03-29 13:36       ` Jonathan Cameron
2022-03-28 20:46   ` Andy Shevchenko
2022-02-07 12:59 ` [PATCH v2 06/12] iio: buffer-dma: split iio_dma_buffer_fileio_free() function Paul Cercueil
2022-02-07 12:59 ` [PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution Paul Cercueil
2022-03-28 17:54   ` Jonathan Cameron
2022-03-28 17:54     ` Christian König
2022-03-28 19:16     ` Paul Cercueil
2022-03-28 20:58       ` Andy Shevchenko
2022-02-07 12:59 ` [PATCH v2 08/12] iio: buffer-dma: Implement new DMABUF based userspace API Paul Cercueil
2022-02-07 12:59 ` [PATCH v2 09/12] iio: buffer-dmaengine: Support " Paul Cercueil
2022-02-07 12:59 ` [PATCH v2 10/12] iio: core: Add support for cyclic buffers Paul Cercueil
2022-02-07 13:01 ` [PATCH v2 11/12] iio: buffer-dmaengine: " Paul Cercueil
2022-02-07 13:01   ` [PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API Paul Cercueil
2022-03-29  8:54     ` Daniel Vetter
2022-03-29  9:47       ` Paul Cercueil
2022-03-29 14:07         ` Daniel Vetter
2022-03-29 17:34           ` Paul Cercueil
2022-03-30  9:22             ` Daniel Vetter
2022-02-13 18:46 ` [PATCH v2 00/12] iio: buffer-dma: write() and new " Jonathan Cameron
2022-02-15 17:43   ` Paul Cercueil
2022-03-29  8:33     ` Daniel Vetter
2022-03-29  9:11       ` Paul Cercueil
2022-03-29 14:10         ` Daniel Vetter
2022-03-29 17:16           ` Paul Cercueil
2022-03-30  9:19             ` Daniel Vetter
