linux-media.vger.kernel.org archive mirror
* [RFC][PATCH 0/6] dma-buf: Performance improvements for system heap
@ 2020-09-26  4:24 John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 1/6] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists John Stultz
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: John Stultz @ 2020-09-26  4:24 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Sumit Semwal, Liam Mark, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Suren Baghdasaryan,
	Sandeep Patil, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

Hey All,
  So this patch series contains a series of performance
optimizations to the dma-buf system heap.

Unfortunately, in working these up, I realized the heap-helpers
infrastructure we tried to add to minimize code duplication is
not as generic as we intended. For some heaps it makes sense to
deal with page lists, for other heaps it makes more sense to
track things with sgtables.

So this series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After which the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked.

After this, the series introduces two optimizations to the
system heap, utilizing large order pages, and adding a page-pool
(maybe abusing the pagepool logic from the network code, but it
seems silly to reimplement it).
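
As a rough illustration of why large order pages help (this is a
userspace sketch, not the code from this series), the snippet below
splits a page-aligned length into chunks, always taking the largest
order that still fits. The order list {8, 4, 0}, the 4 KiB page size
and all sketch_* names are assumptions made for the example, and the
sketch ignores the fallback a real allocator needs when a large
order allocation fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical order list, largest first; the series' real values may differ. */
static const unsigned int sketch_orders[] = { 8, 4, 0 };
#define SKETCH_NUM_ORDERS (sizeof(sketch_orders) / sizeof(sketch_orders[0]))
#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */

/*
 * Split a page-aligned length into chunks, preferring the largest
 * order that still fits. counts[i] receives the number of chunks of
 * order sketch_orders[i]. The return value is the total number of
 * chunks, i.e. how many scatterlist entries such a buffer would need.
 */
static size_t sketch_plan_chunks(size_t len, size_t counts[SKETCH_NUM_ORDERS])
{
	size_t remaining = len / SKETCH_PAGE_SIZE;	/* pages left to cover */
	size_t total = 0;

	assert(len % SKETCH_PAGE_SIZE == 0);

	for (size_t i = 0; i < SKETCH_NUM_ORDERS; i++) {
		size_t pages_per_chunk = (size_t)1 << sketch_orders[i];

		counts[i] = remaining / pages_per_chunk;
		remaining -= counts[i] * pages_per_chunk;
		total += counts[i];
	}
	return total;
}
```

Under these assumptions a 273 page buffer needs three entries (one
per order) instead of 273 single-page entries, which is the kind of
saving that tracking the buffer in an sgtable makes possible.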


I implemented a simple allocation microbenchmark to compare
dmabuf heaps vs ion:
  https://git.linaro.org/people/john.stultz/android-dev.git/commit/?h=dev/dma-buf-heap-perf&id=e33aabd34b300f8f8be8d71ec7253dd0abe702f2

With these changes, the allocation path is *much* improved,
performing better than ION (though to be fair, the repeated
allocating and freeing of the same size buffer is the ideal
case for the pagepool logic, so don't read too much into it).

I charted some data points from the microbenchmark for each
of the patches, should folks be interested:
https://docs.google.com/spreadsheets/d/1-1C8ZQpmkl_0DISkI6z4xelE08MlNAN7oEu34AnO4Ao/edit#gid=0

Finally, the series includes a port of a patch that Ørjan Eide
implemented for ION, which avoids calling sync on attachments
that don't have a mapping.
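
The idea can be modelled in a few lines of userspace C. This is only
a sketch under assumptions: the mapped flag, the sync counter and
the sketch_* names are hypothetical stand-ins, and the real patch
may track the mapping state differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-device attachment. */
struct sketch_attachment {
	bool mapped;		/* set on map, cleared on unmap */
	unsigned int syncs;	/* counts sync calls, for illustration only */
	struct sketch_attachment *next;
};

/* Stand-in for dma_sync_sg_for_cpu(): only records that it ran. */
static void sketch_sync_for_cpu(struct sketch_attachment *a)
{
	assert(a->mapped);	/* syncing an unmapped attachment is a bug in this model */
	a->syncs++;
}

/*
 * Walk the attachment list and sync only the attachments that
 * currently have a mapping. Returns the number of syncs issued.
 */
static size_t sketch_begin_cpu_access(struct sketch_attachment *head)
{
	size_t issued = 0;

	for (struct sketch_attachment *a = head; a; a = a->next) {
		if (!a->mapped)
			continue;	/* nothing mapped, so nothing to sync */
		sketch_sync_for_cpu(a);
		issued++;
	}
	return issued;
}
```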

Feedback on these would be great!


Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (6):
  dma-buf: system_heap: Rework system heap to use sgtables instead of
    pagelists
  dma-buf: heaps: Move heap-helper logic into the cma_heap
    implementation
  dma-buf: heaps: Remove heap-helpers code
  dma-buf: system_heap: Allocate higher order pages if available
  dma-buf: system_heap: Add pagepool support to system heap
  dma-buf: heaps: Skip sync if not mapped

 drivers/dma-buf/heaps/Kconfig        |   1 +
 drivers/dma-buf/heaps/Makefile       |   1 -
 drivers/dma-buf/heaps/cma_heap.c     | 332 +++++++++++++++++----
 drivers/dma-buf/heaps/heap-helpers.c | 271 -----------------
 drivers/dma-buf/heaps/heap-helpers.h |  53 ----
 drivers/dma-buf/heaps/system_heap.c  | 426 ++++++++++++++++++++++++---
 6 files changed, 660 insertions(+), 424 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

-- 
2.17.1



* [RFC][PATCH 1/6] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists
  2020-09-26  4:24 [RFC][PATCH 0/6] dma-buf: Performance improvements for system heap John Stultz
@ 2020-09-26  4:24 ` John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 2/6] dma-buf: heaps: Move heap-helper logic into the cma_heap implementation John Stultz
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2020-09-26  4:24 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Sumit Semwal, Liam Mark, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Suren Baghdasaryan,
	Sandeep Patil, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

In preparation for some patches to optimize the system
heap code, rework the dmabuf exporter to utilize sgtables rather
than pagelists for tracking the associated pages.

This will allow for large order page allocations, as well as
more efficient page pooling.

In doing so, the system heap stops using the heap-helpers logic
which sadly is not quite as generic as I was hoping it to be, so
this patch adds heap specific implementations of the dma_buf_ops
function handlers.
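
To illustrate the difference between the two tracking schemes, the
userspace sketch below expands a table of variable-length entries
into a flat per-page array, similar in spirit to walking an sgtable
page by page before calling vmap(). The struct, the 4 KiB page size
and the sketch_* names are assumptions made for the example; they
are not kernel types.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */

/* Hypothetical stand-in for one scatterlist entry: a run of contiguous pages. */
struct sketch_sg {
	unsigned long first_pfn;	/* page frame number of the first page */
	size_t length;			/* bytes, a multiple of the page size */
};

/*
 * Expand a table of variable-length entries into a flat array with
 * one element per page. Returns the number of pages written, which
 * never exceeds max_pages.
 */
static size_t sketch_expand_pages(const struct sketch_sg *table, size_t nents,
				  unsigned long *pfns, size_t max_pages)
{
	size_t n = 0;

	for (size_t i = 0; i < nents; i++) {
		size_t pages = table[i].length / SKETCH_PAGE_SIZE;

		assert(table[i].length % SKETCH_PAGE_SIZE == 0);
		for (size_t p = 0; p < pages && n < max_pages; p++)
			pfns[n++] = table[i].first_pfn + p;
	}
	return n;
}
```

In this patch every entry still covers a single page, so the table
and a page list hold the same information. The table form only pays
off once an entry is allowed to describe more than one page.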

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/dma-buf/heaps/system_heap.c | 343 ++++++++++++++++++++++++----
 1 file changed, 297 insertions(+), 46 deletions(-)

diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 0bf688e3c023..ddfa17dc48a8 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -3,7 +3,11 @@
  * DMABUF System heap exporter
  *
  * Copyright (C) 2011 Google, Inc.
- * Copyright (C) 2019 Linaro Ltd.
+ * Copyright (C) 2019, 2020 Linaro Ltd.
+ *
+ * Portions based off of Andrew Davis' SRAM heap:
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com/
+ *	Andrew F. Davis <afd@ti.com>
  */
 
 #include <linux/dma-buf.h>
@@ -15,72 +19,320 @@
 #include <linux/module.h>
 #include <linux/scatterlist.h>
 #include <linux/slab.h>
-#include <linux/sched/signal.h>
-#include <asm/page.h>
-
-#include "heap-helpers.h"
+#include <linux/vmalloc.h>
 
 struct dma_heap *sys_heap;
 
-static void system_heap_free(struct heap_helper_buffer *buffer)
+struct system_heap_buffer {
+	struct dma_heap *heap;
+	struct list_head attachments;
+	struct mutex lock;
+	unsigned long len;
+	struct sg_table sg_table;
+	int vmap_cnt;
+	void *vaddr;
+};
+
+struct dma_heap_attachment {
+	struct device *dev;
+	struct sg_table *table;
+	struct list_head list;
+};
+
+static struct sg_table *dup_sg_table(struct sg_table *table)
 {
-	pgoff_t pg;
+	struct sg_table *new_table;
+	int ret, i;
+	struct scatterlist *sg, *new_sg;
+
+	new_table = kzalloc(sizeof(*new_table), GFP_KERNEL);
+	if (!new_table)
+		return ERR_PTR(-ENOMEM);
+
+	ret = sg_alloc_table(new_table, table->nents, GFP_KERNEL);
+	if (ret) {
+		kfree(new_table);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	new_sg = new_table->sgl;
+	for_each_sgtable_sg(table, sg, i) {
+		sg_set_page(new_sg, sg_page(sg), sg->length, sg->offset);
+		new_sg = sg_next(new_sg);
+	}
+
+	return new_table;
+}
+
+static int system_heap_attach(struct dma_buf *dmabuf,
+			      struct dma_buf_attachment *attachment)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+	struct dma_heap_attachment *a;
+	struct sg_table *table;
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return -ENOMEM;
+
+	table = dup_sg_table(&buffer->sg_table);
+	if (IS_ERR(table)) {
+		kfree(a);
+		return -ENOMEM;
+	}
+
+	a->table = table;
+	a->dev = attachment->dev;
+	INIT_LIST_HEAD(&a->list);
+
+	attachment->priv = a;
+
+	mutex_lock(&buffer->lock);
+	list_add(&a->list, &buffer->attachments);
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static void system_heap_detatch(struct dma_buf *dmabuf,
+				struct dma_buf_attachment *attachment)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+	struct dma_heap_attachment *a = attachment->priv;
+
+	mutex_lock(&buffer->lock);
+	list_del(&a->list);
+	mutex_unlock(&buffer->lock);
+
+	sg_free_table(a->table);
+	kfree(a->table);
+	kfree(a);
+}
+
+static struct sg_table *system_heap_map_dma_buf(struct dma_buf_attachment *attachment,
+						enum dma_data_direction direction)
+{
+	struct dma_heap_attachment *a = attachment->priv;
+	struct sg_table *table = a->table;
+
+	if (!dma_map_sg(attachment->dev, table->sgl, table->nents, direction))
+		return ERR_PTR(-ENOMEM);
+
+	return table;
+}
+
+static void system_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
+				      struct sg_table *table,
+				      enum dma_data_direction direction)
+{
+	dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+}
+
+static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
+						enum dma_data_direction direction)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+	struct dma_heap_attachment *a;
+	int ret = 0;
+
+	mutex_lock(&buffer->lock);
+
+	if (buffer->vmap_cnt)
+		invalidate_kernel_vmap_range(buffer->vaddr, buffer->len);
+
+	list_for_each_entry(a, &buffer->attachments, list) {
+		dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
+				    direction);
+	}
+	mutex_unlock(&buffer->lock);
+
+	return ret;
+}
+
+static int system_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
+					      enum dma_data_direction direction)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+	struct dma_heap_attachment *a;
+
+	mutex_lock(&buffer->lock);
+
+	if (buffer->vmap_cnt)
+		flush_kernel_vmap_range(buffer->vaddr, buffer->len);
+
+	list_for_each_entry(a, &buffer->attachments, list) {
+		dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
+				       direction);
+	}
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+	struct sg_table *table = &buffer->sg_table;
+	unsigned long addr = vma->vm_start;
+	struct sg_page_iter piter;
+	int ret;
+
+	for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
+		struct page *page = sg_page_iter_page(&piter);
+
+		ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
+				      vma->vm_page_prot);
+		if (ret)
+			return ret;
+		addr += PAGE_SIZE;
+		if (addr >= vma->vm_end)
+			return 0;
+	}
+	return 0;
+}
+
+static void *system_heap_do_vmap(struct system_heap_buffer *buffer)
+{
+	struct sg_table *table = &buffer->sg_table;
+	int npages = PAGE_ALIGN(buffer->len) / PAGE_SIZE;
+	struct page **pages = vmalloc(sizeof(struct page *) * npages);
+	struct page **tmp = pages;
+	struct sg_page_iter piter;
+	void *vaddr;
+
+	if (!pages)
+		return ERR_PTR(-ENOMEM);
+
+	for_each_sgtable_page(table, &piter, 0) {
+		WARN_ON(tmp - pages >= npages);
+		*tmp++ = sg_page_iter_page(&piter);
+	}
+
+	vaddr = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
+	vfree(pages);
+
+	if (!vaddr)
+		return ERR_PTR(-ENOMEM);
+
+	return vaddr;
+}
 
-	for (pg = 0; pg < buffer->pagecount; pg++)
-		__free_page(buffer->pages[pg]);
-	kfree(buffer->pages);
+static void *system_heap_vmap(struct dma_buf *dmabuf)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+	void *vaddr;
+
+	mutex_lock(&buffer->lock);
+	if (buffer->vmap_cnt) {
+		buffer->vmap_cnt++;
+		vaddr = buffer->vaddr;
+		goto out;
+	}
+
+	vaddr = system_heap_do_vmap(buffer);
+	if (IS_ERR(vaddr))
+		goto out;
+	buffer->vaddr = vaddr;
+	buffer->vmap_cnt++;
+out:
+	mutex_unlock(&buffer->lock);
+	return vaddr;
+}
+
+static void system_heap_vunmap(struct dma_buf *dmabuf, void *vaddr)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+
+	mutex_lock(&buffer->lock);
+	if (!--buffer->vmap_cnt) {
+		vunmap(buffer->vaddr);
+		buffer->vaddr = NULL;
+	}
+	mutex_unlock(&buffer->lock);
+}
+
+static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
+{
+	struct system_heap_buffer *buffer = dmabuf->priv;
+	struct sg_table *table;
+	struct scatterlist *sg;
+	int i;
+
+	table = &buffer->sg_table;
+	for_each_sgtable_sg(table, sg, i)
+		__free_page(sg_page(sg));
+	sg_free_table(table);
 	kfree(buffer);
 }
 
+const struct dma_buf_ops system_heap_buf_ops = {
+	.attach = system_heap_attach,
+	.detach = system_heap_detatch,
+	.map_dma_buf = system_heap_map_dma_buf,
+	.unmap_dma_buf = system_heap_unmap_dma_buf,
+	.begin_cpu_access = system_heap_dma_buf_begin_cpu_access,
+	.end_cpu_access = system_heap_dma_buf_end_cpu_access,
+	.mmap = system_heap_mmap,
+	.vmap = system_heap_vmap,
+	.vunmap = system_heap_vunmap,
+	.release = system_heap_dma_buf_release,
+};
+
 static int system_heap_allocate(struct dma_heap *heap,
 				unsigned long len,
 				unsigned long fd_flags,
 				unsigned long heap_flags)
 {
-	struct heap_helper_buffer *helper_buffer;
+	struct system_heap_buffer *buffer;
+	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 	struct dma_buf *dmabuf;
-	int ret = -ENOMEM;
+	struct sg_table *table;
+	struct scatterlist *sg;
+	pgoff_t pagecount;
 	pgoff_t pg;
+	int i, ret = -ENOMEM;
 
-	helper_buffer = kzalloc(sizeof(*helper_buffer), GFP_KERNEL);
-	if (!helper_buffer)
+	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
+	if (!buffer)
 		return -ENOMEM;
 
-	init_heap_helper_buffer(helper_buffer, system_heap_free);
-	helper_buffer->heap = heap;
-	helper_buffer->size = len;
-
-	helper_buffer->pagecount = len / PAGE_SIZE;
-	helper_buffer->pages = kmalloc_array(helper_buffer->pagecount,
-					     sizeof(*helper_buffer->pages),
-					     GFP_KERNEL);
-	if (!helper_buffer->pages) {
-		ret = -ENOMEM;
-		goto err0;
-	}
+	INIT_LIST_HEAD(&buffer->attachments);
+	mutex_init(&buffer->lock);
+	buffer->heap = heap;
+	buffer->len = len;
 
-	for (pg = 0; pg < helper_buffer->pagecount; pg++) {
+	table = &buffer->sg_table;
+	pagecount = len / PAGE_SIZE;
+	if (sg_alloc_table(table, pagecount, GFP_KERNEL))
+		goto free_buffer;
+
+	sg = table->sgl;
+	for (pg = 0; pg < pagecount; pg++) {
+		struct page *page;
 		/*
 		 * Avoid trying to allocate memory if the process
-		 * has been killed by by SIGKILL
+		 * has been killed by SIGKILL
 		 */
 		if (fatal_signal_pending(current))
-			goto err1;
-
-		helper_buffer->pages[pg] = alloc_page(GFP_KERNEL | __GFP_ZERO);
-		if (!helper_buffer->pages[pg])
-			goto err1;
+			goto free_pages;
+		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+		if (!page)
+			goto free_pages;
+		sg_set_page(sg, page, page_size(page), 0);
+		sg = sg_next(sg);
 	}
 
 	/* create the dmabuf */
-	dmabuf = heap_helper_export_dmabuf(helper_buffer, fd_flags);
+	exp_info.ops = &system_heap_buf_ops;
+	exp_info.size = buffer->len;
+	exp_info.flags = fd_flags;
+	exp_info.priv = buffer;
+	dmabuf = dma_buf_export(&exp_info);
 	if (IS_ERR(dmabuf)) {
 		ret = PTR_ERR(dmabuf);
-		goto err1;
+		goto free_pages;
 	}
 
-	helper_buffer->dmabuf = dmabuf;
-
 	ret = dma_buf_fd(dmabuf, fd_flags);
 	if (ret < 0) {
 		dma_buf_put(dmabuf);
@@ -90,12 +342,12 @@ static int system_heap_allocate(struct dma_heap *heap,
 
 	return ret;
 
-err1:
-	while (pg > 0)
-		__free_page(helper_buffer->pages[--pg]);
-	kfree(helper_buffer->pages);
-err0:
-	kfree(helper_buffer);
+free_pages:
+	for_each_sgtable_sg(table, sg, i)
+		__free_page(sg_page(sg));
+	sg_free_table(table);
+free_buffer:
+	kfree(buffer);
 
 	return ret;
 }
@@ -107,7 +359,6 @@ static const struct dma_heap_ops system_heap_ops = {
 static int system_heap_create(void)
 {
 	struct dma_heap_export_info exp_info;
-	int ret = 0;
 
 	exp_info.name = "system";
 	exp_info.ops = &system_heap_ops;
@@ -115,9 +366,9 @@ static int system_heap_create(void)
 
 	sys_heap = dma_heap_add(&exp_info);
 	if (IS_ERR(sys_heap))
-		ret = PTR_ERR(sys_heap);
+		return PTR_ERR(sys_heap);
 
-	return ret;
+	return 0;
 }
 module_init(system_heap_create);
 MODULE_LICENSE("GPL v2");
-- 
2.17.1



* [RFC][PATCH 2/6] dma-buf: heaps: Move heap-helper logic into the cma_heap implementation
  2020-09-26  4:24 [RFC][PATCH 0/6] dma-buf: Performance improvements for system heap John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 1/6] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists John Stultz
@ 2020-09-26  4:24 ` John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 3/6] dma-buf: heaps: Remove heap-helpers code John Stultz
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2020-09-26  4:24 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Sumit Semwal, Liam Mark, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Suren Baghdasaryan,
	Sandeep Patil, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

Since the heap-helpers logic ended up not being as generic as
hoped, move the heap-helpers dma_buf_ops implementations into
the cma_heap directly.

This will allow us to remove the heap-helpers code in a following
patch.
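
One detail worth keeping in mind for the kernel-mapping handlers
that move in this patch: they share a single mapping between users,
count references to it in vmap_cnt, and take the buffer lock while
doing so, so the lock has to be dropped on every return path. A
userspace sketch of that pattern, assuming pthread mutexes and
malloc()/free() as stand-ins for the kernel primitives (all sketch_*
names are hypothetical):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a heap buffer. */
struct sketch_buffer {
	pthread_mutex_t lock;
	int vmap_cnt;	/* number of outstanding mapping users */
	void *vaddr;	/* shared mapping, valid while vmap_cnt > 0 */
	size_t len;
};

/* Return the shared mapping, creating it on first use; NULL on failure. */
static void *sketch_vmap(struct sketch_buffer *buf)
{
	void *vaddr;

	pthread_mutex_lock(&buf->lock);
	if (buf->vmap_cnt) {
		vaddr = buf->vaddr;
		buf->vmap_cnt++;
		goto out;
	}

	vaddr = malloc(buf->len);	/* stands in for vmap() */
	if (!vaddr)
		goto out;

	buf->vaddr = vaddr;
	buf->vmap_cnt++;
out:
	pthread_mutex_unlock(&buf->lock);	/* single unlock for every path */
	return vaddr;
}

/* Drop one reference; tear the mapping down when the last user leaves. */
static void sketch_vunmap(struct sketch_buffer *buf)
{
	pthread_mutex_lock(&buf->lock);
	assert(buf->vmap_cnt > 0);
	if (!--buf->vmap_cnt) {
		free(buf->vaddr);	/* stands in for vunmap() */
		buf->vaddr = NULL;
	}
	pthread_mutex_unlock(&buf->lock);
}
```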

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/dma-buf/heaps/cma_heap.c | 322 ++++++++++++++++++++++++++-----
 1 file changed, 270 insertions(+), 52 deletions(-)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 626cf7fd033a..3adfdbed0829 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -2,76 +2,291 @@
 /*
  * DMABUF CMA heap exporter
  *
- * Copyright (C) 2012, 2019 Linaro Ltd.
+ * Copyright (C) 2012, 2019, 2020 Linaro Ltd.
  * Author: <benjamin.gaignard@linaro.org> for ST-Ericsson.
+ *
+ * Also utilizing parts of Andrew Davis' SRAM heap:
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com/
+ *	Andrew F. Davis <afd@ti.com>
  */
-
 #include <linux/cma.h>
-#include <linux/device.h>
 #include <linux/dma-buf.h>
-#include <linux/dma-heap.h>
 #include <linux/dma-contiguous.h>
+#include <linux/dma-heap.h>
+#include <linux/dma-mapping.h>
 #include <linux/err.h>
-#include <linux/errno.h>
 #include <linux/highmem.h>
+#include <linux/io.h>
+#include <linux/mm.h>
 #include <linux/module.h>
-#include <linux/slab.h>
 #include <linux/scatterlist.h>
-#include <linux/sched/signal.h>
+#include <linux/slab.h>
 
-#include "heap-helpers.h"
 
 struct cma_heap {
 	struct dma_heap *heap;
 	struct cma *cma;
 };
 
-static void cma_heap_free(struct heap_helper_buffer *buffer)
+struct cma_heap_buffer {
+	struct cma_heap *heap;
+	struct list_head attachments;
+	struct mutex lock;
+	unsigned long len;
+	struct page *cma_pages;
+	struct page **pages;
+	pgoff_t pagecount;
+	int vmap_cnt;
+	void *vaddr;
+};
+
+struct dma_heap_attachment {
+	struct device *dev;
+	struct sg_table table;
+	struct list_head list;
+};
+
+static int cma_heap_attach(struct dma_buf *dmabuf,
+			   struct dma_buf_attachment *attachment)
 {
-	struct cma_heap *cma_heap = dma_heap_get_drvdata(buffer->heap);
-	unsigned long nr_pages = buffer->pagecount;
-	struct page *cma_pages = buffer->priv_virt;
+	struct cma_heap_buffer *buffer = dmabuf->priv;
+	struct dma_heap_attachment *a;
+	int ret;
 
-	/* free page list */
-	kfree(buffer->pages);
-	/* release memory */
-	cma_release(cma_heap->cma, cma_pages, nr_pages);
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return -ENOMEM;
+
+	ret = sg_alloc_table_from_pages(&a->table, buffer->pages,
+					buffer->pagecount, 0,
+					buffer->pagecount << PAGE_SHIFT,
+					GFP_KERNEL);
+	if (ret) {
+		kfree(a);
+		return ret;
+	}
+
+	a->dev = attachment->dev;
+	INIT_LIST_HEAD(&a->list);
+
+	attachment->priv = a;
+
+	mutex_lock(&buffer->lock);
+	list_add(&a->list, &buffer->attachments);
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static void cma_heap_detatch(struct dma_buf *dmabuf,
+			     struct dma_buf_attachment *attachment)
+{
+	struct cma_heap_buffer *buffer = dmabuf->priv;
+	struct dma_heap_attachment *a = attachment->priv;
+
+	mutex_lock(&buffer->lock);
+	list_del(&a->list);
+	mutex_unlock(&buffer->lock);
+
+	sg_free_table(&a->table);
+	kfree(a);
+}
+
+static struct sg_table *cma_heap_map_dma_buf(struct dma_buf_attachment *attachment,
+					     enum dma_data_direction direction)
+{
+	struct dma_heap_attachment *a = attachment->priv;
+	struct sg_table *table = &a->table;
+
+	if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
+			direction))
+		table = ERR_PTR(-ENOMEM);
+	return table;
+}
+
+static void cma_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
+				   struct sg_table *table,
+				   enum dma_data_direction direction)
+{
+	dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+}
+
+static int cma_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
+					     enum dma_data_direction direction)
+{
+	struct cma_heap_buffer *buffer = dmabuf->priv;
+	struct dma_heap_attachment *a;
+	int ret = 0;
+
+	if (buffer->vmap_cnt)
+		invalidate_kernel_vmap_range(buffer->vaddr, buffer->len);
+
+	mutex_lock(&buffer->lock);
+	list_for_each_entry(a, &buffer->attachments, list) {
+		dma_sync_sg_for_cpu(a->dev, a->table.sgl, a->table.nents,
+				    direction);
+	}
+	mutex_unlock(&buffer->lock);
+
+	return ret;
+}
+
+static int cma_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
+					   enum dma_data_direction direction)
+{
+	struct cma_heap_buffer *buffer = dmabuf->priv;
+	struct dma_heap_attachment *a;
+
+	if (buffer->vmap_cnt)
+		flush_kernel_vmap_range(buffer->vaddr, buffer->len);
+
+	mutex_lock(&buffer->lock);
+	list_for_each_entry(a, &buffer->attachments, list) {
+		dma_sync_sg_for_device(a->dev, a->table.sgl, a->table.nents,
+				       direction);
+	}
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct cma_heap_buffer *buffer = vma->vm_private_data;
+
+	if (vmf->pgoff >= buffer->pagecount)
+		return VM_FAULT_SIGBUS;
+
+	vmf->page = buffer->pages[vmf->pgoff];
+	get_page(vmf->page);
+
+	return 0;
+}
+
+static const struct vm_operations_struct dma_heap_vm_ops = {
+	.fault = cma_heap_vm_fault,
+};
+
+static int cma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
+{
+	struct cma_heap_buffer *buffer = dmabuf->priv;
+
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
+		return -EINVAL;
+
+	vma->vm_ops = &dma_heap_vm_ops;
+	vma->vm_private_data = buffer;
+
+	return 0;
+}
+
+static void *cma_heap_do_vmap(struct cma_heap_buffer *buffer)
+{
+	void *vaddr;
+
+	vaddr = vmap(buffer->pages, buffer->pagecount, VM_MAP, PAGE_KERNEL);
+	if (!vaddr)
+		return ERR_PTR(-ENOMEM);
+
+	return vaddr;
+}
+
+static void *cma_heap_vmap(struct dma_buf *dmabuf)
+{
+	struct cma_heap_buffer *buffer = dmabuf->priv;
+	void *vaddr;
+
+	mutex_lock(&buffer->lock);
+	if (buffer->vmap_cnt) {
+		buffer->vmap_cnt++;
+		vaddr = buffer->vaddr;
+		goto out;
+	}
+
+	vaddr = cma_heap_do_vmap(buffer);
+	if (IS_ERR(vaddr))
+		goto out;
+	buffer->vaddr = vaddr;
+	buffer->vmap_cnt++;
+out:
+	mutex_unlock(&buffer->lock);
+	return vaddr;
+}
+
+static void cma_heap_vunmap(struct dma_buf *dmabuf, void *vaddr)
+{
+	struct cma_heap_buffer *buffer = dmabuf->priv;
+
+	mutex_lock(&buffer->lock);
+	if (!--buffer->vmap_cnt) {
+		vunmap(buffer->vaddr);
+		buffer->vaddr = NULL;
+	}
+	mutex_unlock(&buffer->lock);
+}
+
+static void cma_heap_dma_buf_release(struct dma_buf *dmabuf)
+{
+	struct cma_heap_buffer *buffer = dmabuf->priv;
+	struct cma_heap *cma_heap = buffer->heap;
+
+	if (buffer->vmap_cnt > 0) {
+		WARN(1, "%s: buffer still mapped in the kernel\n", __func__);
+		vunmap(buffer->vaddr);
+	}
+
+	cma_release(cma_heap->cma, buffer->cma_pages, buffer->pagecount);
 	kfree(buffer);
 }
 
-/* dmabuf heap CMA operations functions */
+const struct dma_buf_ops cma_heap_buf_ops = {
+	.attach = cma_heap_attach,
+	.detach = cma_heap_detatch,
+	.map_dma_buf = cma_heap_map_dma_buf,
+	.unmap_dma_buf = cma_heap_unmap_dma_buf,
+	.begin_cpu_access = cma_heap_dma_buf_begin_cpu_access,
+	.end_cpu_access = cma_heap_dma_buf_end_cpu_access,
+	.mmap = cma_heap_mmap,
+	.vmap = cma_heap_vmap,
+	.vunmap = cma_heap_vunmap,
+	.release = cma_heap_dma_buf_release,
+};
+
 static int cma_heap_allocate(struct dma_heap *heap,
-			     unsigned long len,
-			     unsigned long fd_flags,
-			     unsigned long heap_flags)
+				  unsigned long len,
+				  unsigned long fd_flags,
+				  unsigned long heap_flags)
 {
 	struct cma_heap *cma_heap = dma_heap_get_drvdata(heap);
-	struct heap_helper_buffer *helper_buffer;
-	struct page *cma_pages;
+	struct cma_heap_buffer *buffer;
+	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 	size_t size = PAGE_ALIGN(len);
-	unsigned long nr_pages = size >> PAGE_SHIFT;
+	pgoff_t pagecount = size >> PAGE_SHIFT;
 	unsigned long align = get_order(size);
+	struct page *cma_pages;
 	struct dma_buf *dmabuf;
-	int ret = -ENOMEM;
 	pgoff_t pg;
+	int ret = -ENOMEM;
 
-	if (align > CONFIG_CMA_ALIGNMENT)
-		align = CONFIG_CMA_ALIGNMENT;
-
-	helper_buffer = kzalloc(sizeof(*helper_buffer), GFP_KERNEL);
-	if (!helper_buffer)
+	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
+	if (!buffer)
 		return -ENOMEM;
 
-	init_heap_helper_buffer(helper_buffer, cma_heap_free);
-	helper_buffer->heap = heap;
-	helper_buffer->size = len;
+	INIT_LIST_HEAD(&buffer->attachments);
+	mutex_init(&buffer->lock);
+	buffer->len = size;
 
-	cma_pages = cma_alloc(cma_heap->cma, nr_pages, align, false);
+	if (align > CONFIG_CMA_ALIGNMENT)
+		align = CONFIG_CMA_ALIGNMENT;
+
+	cma_pages = cma_alloc(cma_heap->cma, pagecount, align, false);
 	if (!cma_pages)
-		goto free_buf;
+		goto free_buffer;
 
+	/* Clear the cma pages */
 	if (PageHighMem(cma_pages)) {
-		unsigned long nr_clear_pages = nr_pages;
+		unsigned long nr_clear_pages = pagecount;
 		struct page *page = cma_pages;
 
 		while (nr_clear_pages > 0) {
@@ -85,7 +300,6 @@ static int cma_heap_allocate(struct dma_heap *heap,
 			 */
 			if (fatal_signal_pending(current))
 				goto free_cma;
-
 			page++;
 			nr_clear_pages--;
 		}
@@ -93,28 +307,30 @@ static int cma_heap_allocate(struct dma_heap *heap,
 		memset(page_address(cma_pages), 0, size);
 	}
 
-	helper_buffer->pagecount = nr_pages;
-	helper_buffer->pages = kmalloc_array(helper_buffer->pagecount,
-					     sizeof(*helper_buffer->pages),
-					     GFP_KERNEL);
-	if (!helper_buffer->pages) {
+	buffer->pages = kmalloc_array(pagecount, sizeof(*buffer->pages), GFP_KERNEL);
+	if (!buffer->pages) {
 		ret = -ENOMEM;
 		goto free_cma;
 	}
 
-	for (pg = 0; pg < helper_buffer->pagecount; pg++)
-		helper_buffer->pages[pg] = &cma_pages[pg];
+	for (pg = 0; pg < pagecount; pg++)
+		buffer->pages[pg] = &cma_pages[pg];
+
+	buffer->cma_pages = cma_pages;
+	buffer->heap = cma_heap;
+	buffer->pagecount = pagecount;
 
 	/* create the dmabuf */
-	dmabuf = heap_helper_export_dmabuf(helper_buffer, fd_flags);
+	exp_info.ops = &cma_heap_buf_ops;
+	exp_info.size = buffer->len;
+	exp_info.flags = fd_flags;
+	exp_info.priv = buffer;
+	dmabuf = dma_buf_export(&exp_info);
 	if (IS_ERR(dmabuf)) {
 		ret = PTR_ERR(dmabuf);
 		goto free_pages;
 	}
 
-	helper_buffer->dmabuf = dmabuf;
-	helper_buffer->priv_virt = cma_pages;
-
 	ret = dma_buf_fd(dmabuf, fd_flags);
 	if (ret < 0) {
 		dma_buf_put(dmabuf);
@@ -125,15 +341,16 @@ static int cma_heap_allocate(struct dma_heap *heap,
 	return ret;
 
 free_pages:
-	kfree(helper_buffer->pages);
+	kfree(buffer->pages);
 free_cma:
-	cma_release(cma_heap->cma, cma_pages, nr_pages);
-free_buf:
-	kfree(helper_buffer);
+	cma_release(cma_heap->cma, cma_pages, pagecount);
+free_buffer:
+	kfree(buffer);
+
 	return ret;
 }
 
-static const struct dma_heap_ops cma_heap_ops = {
+static struct dma_heap_ops cma_heap_ops = {
 	.allocate = cma_heap_allocate,
 };
 
@@ -175,3 +392,4 @@ static int add_default_cma_heap(void)
 module_init(add_default_cma_heap);
 MODULE_DESCRIPTION("DMA-BUF CMA Heap");
 MODULE_LICENSE("GPL v2");
+
-- 
2.17.1



* [RFC][PATCH 3/6] dma-buf: heaps: Remove heap-helpers code
  2020-09-26  4:24 [RFC][PATCH 0/6] dma-buf: Performance improvements for system heap John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 1/6] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 2/6] dma-buf: heaps: Move heap-helper logic into the cma_heap implementation John Stultz
@ 2020-09-26  4:24 ` John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 4/6] dma-buf: system_heap: Allocate higher order pages if available John Stultz
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2020-09-26  4:24 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Sumit Semwal, Liam Mark, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Suren Baghdasaryan,
	Sandeep Patil, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

The heap-helpers code was not as generic as initially hoped
and it is now not being used, so remove it from the tree.

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/dma-buf/heaps/Makefile       |   1 -
 drivers/dma-buf/heaps/heap-helpers.c | 271 ---------------------------
 drivers/dma-buf/heaps/heap-helpers.h |  53 ------
 3 files changed, 325 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 6e54cdec3da0..974467791032 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,4 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
-obj-y					+= heap-helpers.o
 obj-$(CONFIG_DMABUF_HEAPS_SYSTEM)	+= system_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_CMA)		+= cma_heap.o
diff --git a/drivers/dma-buf/heaps/heap-helpers.c b/drivers/dma-buf/heaps/heap-helpers.c
deleted file mode 100644
index 9f964ca3f59c..000000000000
--- a/drivers/dma-buf/heaps/heap-helpers.c
+++ /dev/null
@@ -1,271 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <linux/device.h>
-#include <linux/dma-buf.h>
-#include <linux/err.h>
-#include <linux/highmem.h>
-#include <linux/idr.h>
-#include <linux/list.h>
-#include <linux/slab.h>
-#include <linux/uaccess.h>
-#include <linux/vmalloc.h>
-#include <uapi/linux/dma-heap.h>
-
-#include "heap-helpers.h"
-
-void init_heap_helper_buffer(struct heap_helper_buffer *buffer,
-			     void (*free)(struct heap_helper_buffer *))
-{
-	buffer->priv_virt = NULL;
-	mutex_init(&buffer->lock);
-	buffer->vmap_cnt = 0;
-	buffer->vaddr = NULL;
-	buffer->pagecount = 0;
-	buffer->pages = NULL;
-	INIT_LIST_HEAD(&buffer->attachments);
-	buffer->free = free;
-}
-
-struct dma_buf *heap_helper_export_dmabuf(struct heap_helper_buffer *buffer,
-					  int fd_flags)
-{
-	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
-
-	exp_info.ops = &heap_helper_ops;
-	exp_info.size = buffer->size;
-	exp_info.flags = fd_flags;
-	exp_info.priv = buffer;
-
-	return dma_buf_export(&exp_info);
-}
-
-static void *dma_heap_map_kernel(struct heap_helper_buffer *buffer)
-{
-	void *vaddr;
-
-	vaddr = vmap(buffer->pages, buffer->pagecount, VM_MAP, PAGE_KERNEL);
-	if (!vaddr)
-		return ERR_PTR(-ENOMEM);
-
-	return vaddr;
-}
-
-static void dma_heap_buffer_destroy(struct heap_helper_buffer *buffer)
-{
-	if (buffer->vmap_cnt > 0) {
-		WARN(1, "%s: buffer still mapped in the kernel\n", __func__);
-		vunmap(buffer->vaddr);
-	}
-
-	buffer->free(buffer);
-}
-
-static void *dma_heap_buffer_vmap_get(struct heap_helper_buffer *buffer)
-{
-	void *vaddr;
-
-	if (buffer->vmap_cnt) {
-		buffer->vmap_cnt++;
-		return buffer->vaddr;
-	}
-	vaddr = dma_heap_map_kernel(buffer);
-	if (IS_ERR(vaddr))
-		return vaddr;
-	buffer->vaddr = vaddr;
-	buffer->vmap_cnt++;
-	return vaddr;
-}
-
-static void dma_heap_buffer_vmap_put(struct heap_helper_buffer *buffer)
-{
-	if (!--buffer->vmap_cnt) {
-		vunmap(buffer->vaddr);
-		buffer->vaddr = NULL;
-	}
-}
-
-struct dma_heaps_attachment {
-	struct device *dev;
-	struct sg_table table;
-	struct list_head list;
-};
-
-static int dma_heap_attach(struct dma_buf *dmabuf,
-			   struct dma_buf_attachment *attachment)
-{
-	struct dma_heaps_attachment *a;
-	struct heap_helper_buffer *buffer = dmabuf->priv;
-	int ret;
-
-	a = kzalloc(sizeof(*a), GFP_KERNEL);
-	if (!a)
-		return -ENOMEM;
-
-	ret = sg_alloc_table_from_pages(&a->table, buffer->pages,
-					buffer->pagecount, 0,
-					buffer->pagecount << PAGE_SHIFT,
-					GFP_KERNEL);
-	if (ret) {
-		kfree(a);
-		return ret;
-	}
-
-	a->dev = attachment->dev;
-	INIT_LIST_HEAD(&a->list);
-
-	attachment->priv = a;
-
-	mutex_lock(&buffer->lock);
-	list_add(&a->list, &buffer->attachments);
-	mutex_unlock(&buffer->lock);
-
-	return 0;
-}
-
-static void dma_heap_detach(struct dma_buf *dmabuf,
-			    struct dma_buf_attachment *attachment)
-{
-	struct dma_heaps_attachment *a = attachment->priv;
-	struct heap_helper_buffer *buffer = dmabuf->priv;
-
-	mutex_lock(&buffer->lock);
-	list_del(&a->list);
-	mutex_unlock(&buffer->lock);
-
-	sg_free_table(&a->table);
-	kfree(a);
-}
-
-static
-struct sg_table *dma_heap_map_dma_buf(struct dma_buf_attachment *attachment,
-				      enum dma_data_direction direction)
-{
-	struct dma_heaps_attachment *a = attachment->priv;
-	struct sg_table *table;
-
-	table = &a->table;
-
-	if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
-			direction))
-		table = ERR_PTR(-ENOMEM);
-	return table;
-}
-
-static void dma_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
-				   struct sg_table *table,
-				   enum dma_data_direction direction)
-{
-	dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
-}
-
-static vm_fault_t dma_heap_vm_fault(struct vm_fault *vmf)
-{
-	struct vm_area_struct *vma = vmf->vma;
-	struct heap_helper_buffer *buffer = vma->vm_private_data;
-
-	if (vmf->pgoff > buffer->pagecount)
-		return VM_FAULT_SIGBUS;
-
-	vmf->page = buffer->pages[vmf->pgoff];
-	get_page(vmf->page);
-
-	return 0;
-}
-
-static const struct vm_operations_struct dma_heap_vm_ops = {
-	.fault = dma_heap_vm_fault,
-};
-
-static int dma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
-{
-	struct heap_helper_buffer *buffer = dmabuf->priv;
-
-	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
-		return -EINVAL;
-
-	vma->vm_ops = &dma_heap_vm_ops;
-	vma->vm_private_data = buffer;
-
-	return 0;
-}
-
-static void dma_heap_dma_buf_release(struct dma_buf *dmabuf)
-{
-	struct heap_helper_buffer *buffer = dmabuf->priv;
-
-	dma_heap_buffer_destroy(buffer);
-}
-
-static int dma_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
-					     enum dma_data_direction direction)
-{
-	struct heap_helper_buffer *buffer = dmabuf->priv;
-	struct dma_heaps_attachment *a;
-	int ret = 0;
-
-	mutex_lock(&buffer->lock);
-
-	if (buffer->vmap_cnt)
-		invalidate_kernel_vmap_range(buffer->vaddr, buffer->size);
-
-	list_for_each_entry(a, &buffer->attachments, list) {
-		dma_sync_sg_for_cpu(a->dev, a->table.sgl, a->table.nents,
-				    direction);
-	}
-	mutex_unlock(&buffer->lock);
-
-	return ret;
-}
-
-static int dma_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
-					   enum dma_data_direction direction)
-{
-	struct heap_helper_buffer *buffer = dmabuf->priv;
-	struct dma_heaps_attachment *a;
-
-	mutex_lock(&buffer->lock);
-
-	if (buffer->vmap_cnt)
-		flush_kernel_vmap_range(buffer->vaddr, buffer->size);
-
-	list_for_each_entry(a, &buffer->attachments, list) {
-		dma_sync_sg_for_device(a->dev, a->table.sgl, a->table.nents,
-				       direction);
-	}
-	mutex_unlock(&buffer->lock);
-
-	return 0;
-}
-
-static void *dma_heap_dma_buf_vmap(struct dma_buf *dmabuf)
-{
-	struct heap_helper_buffer *buffer = dmabuf->priv;
-	void *vaddr;
-
-	mutex_lock(&buffer->lock);
-	vaddr = dma_heap_buffer_vmap_get(buffer);
-	mutex_unlock(&buffer->lock);
-
-	return vaddr;
-}
-
-static void dma_heap_dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
-{
-	struct heap_helper_buffer *buffer = dmabuf->priv;
-
-	mutex_lock(&buffer->lock);
-	dma_heap_buffer_vmap_put(buffer);
-	mutex_unlock(&buffer->lock);
-}
-
-const struct dma_buf_ops heap_helper_ops = {
-	.map_dma_buf = dma_heap_map_dma_buf,
-	.unmap_dma_buf = dma_heap_unmap_dma_buf,
-	.mmap = dma_heap_mmap,
-	.release = dma_heap_dma_buf_release,
-	.attach = dma_heap_attach,
-	.detach = dma_heap_detach,
-	.begin_cpu_access = dma_heap_dma_buf_begin_cpu_access,
-	.end_cpu_access = dma_heap_dma_buf_end_cpu_access,
-	.vmap = dma_heap_dma_buf_vmap,
-	.vunmap = dma_heap_dma_buf_vunmap,
-};
diff --git a/drivers/dma-buf/heaps/heap-helpers.h b/drivers/dma-buf/heaps/heap-helpers.h
deleted file mode 100644
index 805d2df88024..000000000000
--- a/drivers/dma-buf/heaps/heap-helpers.h
+++ /dev/null
@@ -1,53 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * DMABUF Heaps helper code
- *
- * Copyright (C) 2011 Google, Inc.
- * Copyright (C) 2019 Linaro Ltd.
- */
-
-#ifndef _HEAP_HELPERS_H
-#define _HEAP_HELPERS_H
-
-#include <linux/dma-heap.h>
-#include <linux/list.h>
-
-/**
- * struct heap_helper_buffer - helper buffer metadata
- * @heap:		back pointer to the heap the buffer came from
- * @dmabuf:		backing dma-buf for this buffer
- * @size:		size of the buffer
- * @priv_virt		pointer to heap specific private value
- * @lock		mutext to protect the data in this structure
- * @vmap_cnt		count of vmap references on the buffer
- * @vaddr		vmap'ed virtual address
- * @pagecount		number of pages in the buffer
- * @pages		list of page pointers
- * @attachments		list of device attachments
- *
- * @free		heap callback to free the buffer
- */
-struct heap_helper_buffer {
-	struct dma_heap *heap;
-	struct dma_buf *dmabuf;
-	size_t size;
-
-	void *priv_virt;
-	struct mutex lock;
-	int vmap_cnt;
-	void *vaddr;
-	pgoff_t pagecount;
-	struct page **pages;
-	struct list_head attachments;
-
-	void (*free)(struct heap_helper_buffer *buffer);
-};
-
-void init_heap_helper_buffer(struct heap_helper_buffer *buffer,
-			     void (*free)(struct heap_helper_buffer *));
-
-struct dma_buf *heap_helper_export_dmabuf(struct heap_helper_buffer *buffer,
-					  int fd_flags);
-
-extern const struct dma_buf_ops heap_helper_ops;
-#endif /* _HEAP_HELPERS_H */
-- 
2.17.1



* [RFC][PATCH 4/6] dma-buf: system_heap: Allocate higher order pages if available
  2020-09-26  4:24 [RFC][PATCH 0/6] dma-buf: Performance improvements for system heap John Stultz
                   ` (2 preceding siblings ...)
  2020-09-26  4:24 ` [RFC][PATCH 3/6] dma-buf: heaps: Remove heap-helpers code John Stultz
@ 2020-09-26  4:24 ` John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap John Stultz
  2020-09-26  4:24 ` [RFC][PATCH 6/6] dma-buf: heaps: Skip sync if not mapped John Stultz
  5 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2020-09-26  4:24 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Sumit Semwal, Liam Mark, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Suren Baghdasaryan,
	Sandeep Patil, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

While the system heap can return non-contiguous pages,
try to allocate larger order pages if possible.

This will allow slight performance gains and make implementing
page pooling easier.
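
As a rough userspace illustration of the order selection described
above (not the kernel code: the 4 KiB page size and every sketch_*
name are assumptions made for this example, and allocation failure is
not modeled), the greedy "largest order that still fits, never larger
than the previous chunk" walk looks something like:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Same order list as the patch: try order 8, then 4, then 0. */
static const unsigned int sketch_orders[] = {8, 4, 0};
#define SKETCH_NUM_ORDERS (sizeof(sketch_orders) / sizeof(sketch_orders[0]))

/*
 * Pick the largest order whose chunk fits in `size` and that does not
 * exceed `max_order`. Returns the order, or -1 if nothing fits.
 */
static int sketch_pick_order(unsigned long size, unsigned int max_order)
{
	size_t i;

	for (i = 0; i < SKETCH_NUM_ORDERS; i++) {
		if (size < (SKETCH_PAGE_SIZE << sketch_orders[i]))
			continue;
		if (max_order < sketch_orders[i])
			continue;
		return (int)sketch_orders[i];
	}
	return -1;
}

/*
 * Count how many chunks (i.e. scatterlist entries) a page-aligned
 * request of `len` bytes would be split into.
 */
static unsigned int sketch_count_chunks(unsigned long len)
{
	unsigned long remaining = len;
	unsigned int max_order = sketch_orders[0];
	unsigned int chunks = 0;

	assert(len % SKETCH_PAGE_SIZE == 0);

	while (remaining > 0) {
		int order = sketch_pick_order(remaining, max_order);

		if (order < 0)
			break;
		remaining -= SKETCH_PAGE_SIZE << order;
		/* Never go back up to a larger order than the last chunk. */
		max_order = (unsigned int)order;
		chunks++;
	}
	return chunks;
}
```

With these assumptions a 1 MiB request becomes a single order-8 chunk
instead of 256 order-0 pages, which is where the shorter sg table (and
fewer allocator calls) comes from.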

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/dma-buf/heaps/system_heap.c | 85 ++++++++++++++++++++++-------
 1 file changed, 66 insertions(+), 19 deletions(-)

diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index ddfa17dc48a8..882a632e9bb7 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -39,6 +39,14 @@ struct dma_heap_attachment {
 	struct list_head list;
 };
 
+#define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
+				| __GFP_NORETRY) & ~__GFP_RECLAIM) \
+				| __GFP_COMP)
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
+static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
+static const unsigned int orders[] = {8, 4, 0};
+#define NUM_ORDERS ARRAY_SIZE(orders)
+
 static struct sg_table *dup_sg_table(struct sg_table *table)
 {
 	struct sg_table *new_table;
@@ -259,8 +267,11 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
 	int i;
 
 	table = &buffer->sg_table;
-	for_each_sgtable_sg(table, sg, i)
-		__free_page(sg_page(sg));
+	for_each_sg(table->sgl, sg, table->nents, i) {
+		struct page *page = sg_page(sg);
+
+		__free_pages(page, compound_order(page));
+	}
 	sg_free_table(table);
 	kfree(buffer);
 }
@@ -278,6 +289,26 @@ const struct dma_buf_ops system_heap_buf_ops = {
 	.release = system_heap_dma_buf_release,
 };
 
+static struct page *alloc_largest_available(unsigned long size,
+					    unsigned int max_order)
+{
+	struct page *page;
+	int i;
+
+	for (i = 0; i < NUM_ORDERS; i++) {
+		if (size <  (PAGE_SIZE << orders[i]))
+			continue;
+		if (max_order < orders[i])
+			continue;
+
+		page = alloc_pages(order_flags[i], orders[i]);
+		if (!page)
+			continue;
+		return page;
+	}
+	return NULL;
+}
+
 static int system_heap_allocate(struct dma_heap *heap,
 				unsigned long len,
 				unsigned long fd_flags,
@@ -285,11 +316,13 @@ static int system_heap_allocate(struct dma_heap *heap,
 {
 	struct system_heap_buffer *buffer;
 	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
+	unsigned long size_remaining = len;
+	unsigned int max_order = orders[0];
 	struct dma_buf *dmabuf;
 	struct sg_table *table;
 	struct scatterlist *sg;
-	pgoff_t pagecount;
-	pgoff_t pg;
+	struct list_head pages;
+	struct page *page, *tmp_page;
 	int i, ret = -ENOMEM;
 
 	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
@@ -301,25 +334,35 @@ static int system_heap_allocate(struct dma_heap *heap,
 	buffer->heap = heap;
 	buffer->len = len;
 
-	table = &buffer->sg_table;
-	pagecount = len / PAGE_SIZE;
-	if (sg_alloc_table(table, pagecount, GFP_KERNEL))
-		goto free_buffer;
-
-	sg = table->sgl;
-	for (pg = 0; pg < pagecount; pg++) {
-		struct page *page;
+	INIT_LIST_HEAD(&pages);
+	i = 0;
+	while (size_remaining > 0) {
 		/*
 		 * Avoid trying to allocate memory if the process
 		 * has been killed by SIGKILL
 		 */
 		if (fatal_signal_pending(current))
-			goto free_pages;
-		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+			goto free_buffer;
+
+		page = alloc_largest_available(size_remaining, max_order);
 		if (!page)
-			goto free_pages;
-		sg_set_page(sg, page, page_size(page), 0);
+			goto free_buffer;
+
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
+		i++;
+	}
+
+	table = &buffer->sg_table;
+	if (sg_alloc_table(table, i, GFP_KERNEL))
+		goto free_buffer;
+
+	sg = table->sgl;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
+		list_del(&page->lru);
 	}
 
 	/* create the dmabuf */
@@ -339,14 +382,18 @@ static int system_heap_allocate(struct dma_heap *heap,
 		/* just return, as put will call release and that will free */
 		return ret;
 	}
-
 	return ret;
 
 free_pages:
-	for_each_sgtable_sg(table, sg, i)
-		__free_page(sg_page(sg));
+	for_each_sgtable_sg(table, sg, i) {
+		struct page *p = sg_page(sg);
+
+		__free_pages(p, compound_order(p));
+	}
 	sg_free_table(table);
 free_buffer:
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		__free_pages(page, compound_order(page));
 	kfree(buffer);
 
 	return ret;
-- 
2.17.1



* [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap
  2020-09-26  4:24 [RFC][PATCH 0/6] dma-buf: Performance improvements for system heap John Stultz
                   ` (3 preceding siblings ...)
  2020-09-26  4:24 ` [RFC][PATCH 4/6] dma-buf: system_heap: Allocate higher order pages if available John Stultz
@ 2020-09-26  4:24 ` John Stultz
  2020-09-30  4:46   ` Chris Goldsworthy
  2020-09-26  4:24 ` [RFC][PATCH 6/6] dma-buf: heaps: Skip sync if not mapped John Stultz
  5 siblings, 1 reply; 11+ messages in thread
From: John Stultz @ 2020-09-26  4:24 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Sumit Semwal, Liam Mark, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Suren Baghdasaryan,
	Sandeep Patil, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

Reuse/abuse the pagepool code from the network code to speed
up allocation performance.

This is similar to the ION pagepool usage, but tries to
utilize generic code instead of a custom implementation.
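
To make the recycling idea concrete, here is a minimal userspace
sketch of a pool that caches freed chunks instead of handing them
straight back to the allocator. It is only an illustration of the
concept: it does not use or mirror the net/core page_pool API, and the
sketch_* names and the cache size of 4 are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_POOL_CAP 4	/* hypothetical cap on cached chunks */

struct sketch_pool {
	size_t chunk_size;		/* bytes per chunk */
	void *cache[SKETCH_POOL_CAP];	/* recycled chunks, used as a LIFO */
	unsigned int cached;		/* number of valid cache entries */
	unsigned int fresh_allocs;	/* times we fell back to calloc() */
};

static void sketch_pool_init(struct sketch_pool *pool, size_t chunk_size)
{
	pool->chunk_size = chunk_size;
	pool->cached = 0;
	pool->fresh_allocs = 0;
}

/*
 * Fast path: hand out a cached chunk. Slow path: allocate a new one.
 * Note that recycled chunks are NOT re-zeroed in this sketch; only
 * fresh ones come back zeroed from calloc().
 */
static void *sketch_pool_alloc(struct sketch_pool *pool)
{
	if (pool->cached > 0)
		return pool->cache[--pool->cached];

	pool->fresh_allocs++;
	return calloc(1, pool->chunk_size);
}

/* Return a chunk to the pool, or free it if the cache is full. */
static void sketch_pool_put(struct sketch_pool *pool, void *chunk)
{
	if (pool->cached < SKETCH_POOL_CAP) {
		pool->cache[pool->cached++] = chunk;
		return;
	}
	free(chunk);
}

/* Release everything still sitting in the cache. */
static void sketch_pool_destroy(struct sketch_pool *pool)
{
	while (pool->cached > 0)
		free(pool->cache[--pool->cached]);
}
```

In the patch there is one pool per entry in orders[], so a buffer's
chunks go back to the pool matching their compound order on release.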

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/dma-buf/heaps/Kconfig       |  1 +
 drivers/dma-buf/heaps/system_heap.c | 32 +++++++++++++++++++++++++----
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index a5eef06c4226..f13cde4321b1 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -1,6 +1,7 @@
 config DMABUF_HEAPS_SYSTEM
 	bool "DMA-BUF System Heap"
 	depends on DMABUF_HEAPS
+	select PAGE_POOL
 	help
 	  Choose this option to enable the system dmabuf heap. The system heap
 	  is backed by pages from the buddy allocator. If in doubt, say Y.
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 882a632e9bb7..9f57b4c8ae69 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -20,6 +20,7 @@
 #include <linux/scatterlist.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <net/page_pool.h>
 
 struct dma_heap *sys_heap;
 
@@ -46,6 +47,7 @@ struct dma_heap_attachment {
 static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
 static const unsigned int orders[] = {8, 4, 0};
 #define NUM_ORDERS ARRAY_SIZE(orders)
+struct page_pool *pools[NUM_ORDERS];
 
 static struct sg_table *dup_sg_table(struct sg_table *table)
 {
@@ -264,13 +266,17 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
 	struct system_heap_buffer *buffer = dmabuf->priv;
 	struct sg_table *table;
 	struct scatterlist *sg;
-	int i;
+	int i, j;
 
 	table = &buffer->sg_table;
 	for_each_sg(table->sgl, sg, table->nents, i) {
 		struct page *page = sg_page(sg);
 
-		__free_pages(page, compound_order(page));
+		for (j = 0; j < NUM_ORDERS; j++) {
+			if (compound_order(page) == orders[j])
+				break;
+		}
+		page_pool_put_full_page(pools[j], page, false);
 	}
 	sg_free_table(table);
 	kfree(buffer);
@@ -300,8 +306,7 @@ static struct page *alloc_largest_available(unsigned long size,
 			continue;
 		if (max_order < orders[i])
 			continue;
-
-		page = alloc_pages(order_flags[i], orders[i]);
+		page = page_pool_alloc_pages(pools[i], order_flags[i]);
 		if (!page)
 			continue;
 		return page;
@@ -406,6 +411,25 @@ static const struct dma_heap_ops system_heap_ops = {
 static int system_heap_create(void)
 {
 	struct dma_heap_export_info exp_info;
+	int i;
+
+	for (i = 0; i < NUM_ORDERS; i++) {
+		struct page_pool_params pp;
+
+		memset(&pp, 0, sizeof(pp));
+		pp.order = orders[i];
+		pp.dma_dir = DMA_BIDIRECTIONAL;
+		pools[i] = page_pool_create(&pp);
+
+		if (IS_ERR(pools[i])) {
+			int j;
+
+			pr_err("%s: page pool creation failed!\n", __func__);
+			for (j = 0; j < i; j++)
+				page_pool_destroy(pools[j]);
+			return PTR_ERR(pools[i]);
+		}
+	}
 
 	exp_info.name = "system";
 	exp_info.ops = &system_heap_ops;
-- 
2.17.1



* [RFC][PATCH 6/6] dma-buf: heaps: Skip sync if not mapped
  2020-09-26  4:24 [RFC][PATCH 0/6] dma-buf: Performance improvements for system heap John Stultz
                   ` (4 preceding siblings ...)
  2020-09-26  4:24 ` [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap John Stultz
@ 2020-09-26  4:24 ` John Stultz
  5 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2020-09-26  4:24 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Sumit Semwal, Liam Mark, Laura Abbott,
	Brian Starkey, Hridya Valsaraju, Suren Baghdasaryan,
	Sandeep Patil, Ørjan Eide, Robin Murphy, Ezequiel Garcia,
	Simon Ser, James Jones, linux-media, dri-devel

This patch is basically a port of Ørjan Eide's similar patch for ION:
 https://lore.kernel.org/lkml/20200414134629.54567-1-orjan.eide@arm.com/

Only sync the sg-list of dma-buf heap attachment when the attachment
is actually mapped on the device.

dma-bufs may be synced at any time. It can be reached from user space
via DMA_BUF_IOCTL_SYNC, so there are no guarantees from callers on when
syncs may be attempted, and dma_buf_end_cpu_access() and
dma_buf_begin_cpu_access() may not be paired.

Since the sg_list's dma_address isn't set up until the buffer is used
on the device, and dma_map_sg() is called on it, the dma_address will be
NULL if sync is attempted on the dma-buf before it's mapped on a device.

Before v5.0 (commit 55897af63091 ("dma-direct: merge swiotlb_dma_ops
into the dma_direct code")) this was a problem as the dma-api (at least
the swiotlb_dma_ops on arm64) would use the potentially invalid
dma_address. How that failed depended on how the device handled physical
address 0. If 0 was a valid address to physical RAM, that page would get
flushed a lot, while the actual pages in the buffer would not get synced
correctly. If 0 was an invalid physical address, it could cause a
fault and trigger a crash.

In v5.0 this was incidentally fixed by commit 55897af63091 ("dma-direct:
merge swiotlb_dma_ops into the dma_direct code"), as this moved the
dma-api to use the page pointer in the sg_list, and (for Ion buffers at
least) this will always be valid if the sg_list exists at all.

But this issue was re-introduced in v5.3 by
commit 449fa54d6815 ("dma-direct: correct the physical addr in
dma_direct_sync_sg_for_cpu/device"), which moved the dma-api back to
the old behaviour of using the dma_address, which may be invalid.

dma-buf core doesn't ensure that the buffer is mapped on the device, and
thus has a valid sg_list, before calling the exporter's
begin_cpu_access.
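
The shape of the fix can be sketched in plain userspace C as below.
This is an illustration only: the sketch_* names are made up, the
syncs counter merely stands in for a dma_sync_sg_for_cpu() call, and
the sketch assumes the flag should only be set when mapping succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_attachment {
	bool mapped;		/* set on successful map, cleared on unmap */
	unsigned int syncs;	/* stands in for dma_sync_sg_for_cpu() calls */
};

/* Mark the attachment mapped only if the (simulated) mapping worked. */
static int sketch_map(struct sketch_attachment *a, bool map_succeeds)
{
	if (!map_succeeds)
		return -1;
	a->mapped = true;
	return 0;
}

static void sketch_unmap(struct sketch_attachment *a)
{
	a->mapped = false;
}

/*
 * Sync every attachment that is currently mapped and skip the rest,
 * since an unmapped attachment has no valid dma_address to sync.
 * Returns how many attachments were synced.
 */
static unsigned int sketch_begin_cpu_access(struct sketch_attachment *atts,
					    size_t n)
{
	unsigned int synced = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!atts[i].mapped)
			continue;
		atts[i].syncs++;
		synced++;
	}
	return synced;
}
```

A sync issued before any device has mapped the buffer (for example via
DMA_BUF_IOCTL_SYNC right after allocation) then touches nothing, rather
than operating on an sg list whose dma_address was never filled in.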

Logic and commit message originally by: Ørjan Eide <orjan.eide@arm.com>

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/dma-buf/heaps/cma_heap.c    | 10 ++++++++++
 drivers/dma-buf/heaps/system_heap.c | 10 ++++++++++
 2 files changed, 20 insertions(+)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 3adfdbed0829..b6ab0392ad9a 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -44,6 +44,7 @@ struct dma_heap_attachment {
 	struct device *dev;
 	struct sg_table table;
 	struct list_head list;
+	bool mapped;
 };
 
 static int cma_heap_attach(struct dma_buf *dmabuf,
@@ -68,6 +69,7 @@ static int cma_heap_attach(struct dma_buf *dmabuf,
 
 	a->dev = attachment->dev;
 	INIT_LIST_HEAD(&a->list);
+	a->mapped = false;
 
 	attachment->priv = a;
 
@@ -101,6 +103,7 @@ static struct sg_table *cma_heap_map_dma_buf(struct dma_buf_attachment *attachme
 	if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
 			direction))
 		table = ERR_PTR(-ENOMEM);
+	a->mapped = true;
 	return table;
 }
 
@@ -108,6 +111,9 @@ static void cma_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
 				   struct sg_table *table,
 				   enum dma_data_direction direction)
 {
+	struct dma_heap_attachment *a = attachment->priv;
+
+	a->mapped = false;
 	dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
 }
 
@@ -123,6 +129,8 @@ static int cma_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 
 	mutex_lock(&buffer->lock);
 	list_for_each_entry(a, &buffer->attachments, list) {
+		if (!a->mapped)
+			continue;
 		dma_sync_sg_for_cpu(a->dev, a->table.sgl, a->table.nents,
 				    direction);
 	}
@@ -142,6 +150,8 @@ static int cma_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
 
 	mutex_lock(&buffer->lock);
 	list_for_each_entry(a, &buffer->attachments, list) {
+		if (!a->mapped)
+			continue;
 		dma_sync_sg_for_device(a->dev, a->table.sgl, a->table.nents,
 				       direction);
 	}
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 9f57b4c8ae69..8a523b6fd51a 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -38,6 +38,7 @@ struct dma_heap_attachment {
 	struct device *dev;
 	struct sg_table *table;
 	struct list_head list;
+	bool mapped;
 };
 
 #define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
@@ -94,6 +95,7 @@ static int system_heap_attach(struct dma_buf *dmabuf,
 	a->table = table;
 	a->dev = attachment->dev;
 	INIT_LIST_HEAD(&a->list);
+	a->mapped = false;
 
 	attachment->priv = a;
 
@@ -128,6 +130,7 @@ static struct sg_table *system_heap_map_dma_buf(struct dma_buf_attachment *attac
 	if (!dma_map_sg(attachment->dev, table->sgl, table->nents, direction))
 		return ERR_PTR(-ENOMEM);
 
+	a->mapped = true;
 	return table;
 }
 
@@ -135,6 +138,9 @@ static void system_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
 				      struct sg_table *table,
 				      enum dma_data_direction direction)
 {
+	struct dma_heap_attachment *a = attachment->priv;
+
+	a->mapped = false;
 	dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
 }
 
@@ -151,6 +157,8 @@ static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 		invalidate_kernel_vmap_range(buffer->vaddr, buffer->len);
 
 	list_for_each_entry(a, &buffer->attachments, list) {
+		if (!a->mapped)
+			continue;
 		dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
 				    direction);
 	}
@@ -171,6 +179,8 @@ static int system_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
 		flush_kernel_vmap_range(buffer->vaddr, buffer->len);
 
 	list_for_each_entry(a, &buffer->attachments, list) {
+		if (!a->mapped)
+			continue;
 		dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
 				       direction);
 	}
-- 
2.17.1



* Re: [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap
  2020-09-26  4:24 ` [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap John Stultz
@ 2020-09-30  4:46   ` Chris Goldsworthy
  2020-10-01 14:49     ` Chris Goldsworthy
  2020-10-01 22:07     ` John Stultz
  0 siblings, 2 replies; 11+ messages in thread
From: Chris Goldsworthy @ 2020-09-30  4:46 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Sumit Semwal, Liam Mark, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Ørjan Eide, Robin Murphy, Ezequiel Garcia, Simon Ser,
	James Jones, linux-media, dri-devel

On 2020-09-25 21:24, John Stultz wrote:
> Reuse/abuse the pagepool code from the network code to speed
> up allocation performance.
> 
> This is similar to the ION pagepool usage, but tries to
> utilize generic code instead of a custom implementation.
> 
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Liam Mark <lmark@codeaurora.org>
> Cc: Laura Abbott <labbott@kernel.org>
> Cc: Brian Starkey <Brian.Starkey@arm.com>
> Cc: Hridya Valsaraju <hridya@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Sandeep Patil <sspatil@google.com>
> Cc: Ørjan Eide <orjan.eide@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: James Jones <jajones@nvidia.com>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>  drivers/dma-buf/heaps/Kconfig       |  1 +
>  drivers/dma-buf/heaps/system_heap.c | 32 +++++++++++++++++++++++++----
>  2 files changed, 29 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/dma-buf/heaps/Kconfig 
> b/drivers/dma-buf/heaps/Kconfig
> index a5eef06c4226..f13cde4321b1 100644
> --- a/drivers/dma-buf/heaps/Kconfig
> +++ b/drivers/dma-buf/heaps/Kconfig
> @@ -1,6 +1,7 @@
>  config DMABUF_HEAPS_SYSTEM
>  	bool "DMA-BUF System Heap"
>  	depends on DMABUF_HEAPS
> +	select PAGE_POOL
>  	help
>  	  Choose this option to enable the system dmabuf heap. The system 
> heap
>  	  is backed by pages from the buddy allocator. If in doubt, say Y.
> diff --git a/drivers/dma-buf/heaps/system_heap.c
> b/drivers/dma-buf/heaps/system_heap.c
> index 882a632e9bb7..9f57b4c8ae69 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -20,6 +20,7 @@
>  #include <linux/scatterlist.h>
>  #include <linux/slab.h>
>  #include <linux/vmalloc.h>
> +#include <net/page_pool.h>
> 
>  struct dma_heap *sys_heap;
> 
> @@ -46,6 +47,7 @@ struct dma_heap_attachment {
>  static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, 
> LOW_ORDER_GFP};
>  static const unsigned int orders[] = {8, 4, 0};
>  #define NUM_ORDERS ARRAY_SIZE(orders)
> +struct page_pool *pools[NUM_ORDERS];
> 
>  static struct sg_table *dup_sg_table(struct sg_table *table)
>  {
> @@ -264,13 +266,17 @@ static void system_heap_dma_buf_release(struct
> dma_buf *dmabuf)
>  	struct system_heap_buffer *buffer = dmabuf->priv;
>  	struct sg_table *table;
>  	struct scatterlist *sg;
> -	int i;
> +	int i, j;
> 
>  	table = &buffer->sg_table;
>  	for_each_sg(table->sgl, sg, table->nents, i) {
>  		struct page *page = sg_page(sg);
> 
> -		__free_pages(page, compound_order(page));
> +		for (j = 0; j < NUM_ORDERS; j++) {
> +			if (compound_order(page) == orders[j])
> +				break;
> +		}
> +		page_pool_put_full_page(pools[j], page, false);
>  	}
>  	sg_free_table(table);
>  	kfree(buffer);
> @@ -300,8 +306,7 @@ static struct page
> *alloc_largest_available(unsigned long size,
>  			continue;
>  		if (max_order < orders[i])
>  			continue;
> -
> -		page = alloc_pages(order_flags[i], orders[i]);
> +		page = page_pool_alloc_pages(pools[i], order_flags[i]);
>  		if (!page)
>  			continue;
>  		return page;
> @@ -406,6 +411,25 @@ static const struct dma_heap_ops system_heap_ops = 
> {
>  static int system_heap_create(void)
>  {
>  	struct dma_heap_export_info exp_info;
> +	int i;
> +
> +	for (i = 0; i < NUM_ORDERS; i++) {
> +		struct page_pool_params pp;
> +
> +		memset(&pp, 0, sizeof(pp));
> +		pp.order = orders[i];
> +		pp.dma_dir = DMA_BIDIRECTIONAL;
> +		pools[i] = page_pool_create(&pp);
> +
> +		if (IS_ERR(pools[i])) {
> +			int j;
> +
> +			pr_err("%s: page pool creation failed!\n", __func__);
> +			for (j = 0; j < i; j++)
> +				page_pool_destroy(pools[j]);
> +			return PTR_ERR(pools[i]);
> +		}
> +	}
> 
>  	exp_info.name = "system";
>  	exp_info.ops = &system_heap_ops;

This is cool, I didn't know about this pooling code under /net/core.  
Nice and compact.

-- 
The Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum,
a Linux Foundation Collaborative Project


* Re: [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap
  2020-09-30  4:46   ` Chris Goldsworthy
@ 2020-10-01 14:49     ` Chris Goldsworthy
  2020-10-01 18:28       ` John Stultz
  2020-10-01 22:07     ` John Stultz
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Goldsworthy @ 2020-10-01 14:49 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Sumit Semwal, Liam Mark, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Ørjan Eide, Robin Murphy, Ezequiel Garcia, Simon Ser,
	James Jones, linux-media, dri-devel

On 2020-09-29 21:46, Chris Goldsworthy wrote:
> On 2020-09-25 21:24, John Stultz wrote:
>> Reuse/abuse the pagepool code from the network code to speed
>> up allocation performance.
>> 
>> This is similar to the ION pagepool usage, but tries to
>> utilize generic code instead of a custom implementation.
>> 
>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>> Cc: Liam Mark <lmark@codeaurora.org>
>> Cc: Laura Abbott <labbott@kernel.org>
>> Cc: Brian Starkey <Brian.Starkey@arm.com>
>> Cc: Hridya Valsaraju <hridya@google.com>
>> Cc: Suren Baghdasaryan <surenb@google.com>
>> Cc: Sandeep Patil <sspatil@google.com>
>> Cc: Ørjan Eide <orjan.eide@arm.com>
>> Cc: Robin Murphy <robin.murphy@arm.com>
>> Cc: Ezequiel Garcia <ezequiel@collabora.com>
>> Cc: Simon Ser <contact@emersion.fr>
>> Cc: James Jones <jajones@nvidia.com>
>> Cc: linux-media@vger.kernel.org
>> Cc: dri-devel@lists.freedesktop.org
>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>> ---
>>  drivers/dma-buf/heaps/Kconfig       |  1 +
>>  drivers/dma-buf/heaps/system_heap.c | 32 
>> +++++++++++++++++++++++++----
>>  2 files changed, 29 insertions(+), 4 deletions(-)
>> 
>> diff --git a/drivers/dma-buf/heaps/Kconfig 
>> b/drivers/dma-buf/heaps/Kconfig
>> index a5eef06c4226..f13cde4321b1 100644
>> --- a/drivers/dma-buf/heaps/Kconfig
>> +++ b/drivers/dma-buf/heaps/Kconfig
>> @@ -1,6 +1,7 @@
>>  config DMABUF_HEAPS_SYSTEM
>>  	bool "DMA-BUF System Heap"
>>  	depends on DMABUF_HEAPS
>> +	select PAGE_POOL
>>  	help
>>  	  Choose this option to enable the system dmabuf heap. The system 
>> heap
>>  	  is backed by pages from the buddy allocator. If in doubt, say Y.
>> diff --git a/drivers/dma-buf/heaps/system_heap.c
>> b/drivers/dma-buf/heaps/system_heap.c
>> index 882a632e9bb7..9f57b4c8ae69 100644
>> --- a/drivers/dma-buf/heaps/system_heap.c
>> +++ b/drivers/dma-buf/heaps/system_heap.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/scatterlist.h>
>>  #include <linux/slab.h>
>>  #include <linux/vmalloc.h>
>> +#include <net/page_pool.h>
>> 
>>  struct dma_heap *sys_heap;
>> 
>> @@ -46,6 +47,7 @@ struct dma_heap_attachment {
>>  static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, 
>> LOW_ORDER_GFP};
>>  static const unsigned int orders[] = {8, 4, 0};
>>  #define NUM_ORDERS ARRAY_SIZE(orders)
>> +struct page_pool *pools[NUM_ORDERS];
>> 
>>  static struct sg_table *dup_sg_table(struct sg_table *table)
>>  {
>> @@ -264,13 +266,17 @@ static void system_heap_dma_buf_release(struct
>> dma_buf *dmabuf)
>>  	struct system_heap_buffer *buffer = dmabuf->priv;
>>  	struct sg_table *table;
>>  	struct scatterlist *sg;
>> -	int i;
>> +	int i, j;
>> 
>>  	table = &buffer->sg_table;
>>  	for_each_sg(table->sgl, sg, table->nents, i) {
>>  		struct page *page = sg_page(sg);
>> 
>> -		__free_pages(page, compound_order(page));
>> +		for (j = 0; j < NUM_ORDERS; j++) {
>> +			if (compound_order(page) == orders[j])
>> +				break;
>> +		}
>> +		page_pool_put_full_page(pools[j], page, false);
>>  	}
>>  	sg_free_table(table);
>>  	kfree(buffer);
>> @@ -300,8 +306,7 @@ static struct page
>> *alloc_largest_available(unsigned long size,
>>  			continue;
>>  		if (max_order < orders[i])
>>  			continue;
>> -
>> -		page = alloc_pages(order_flags[i], orders[i]);
>> +		page = page_pool_alloc_pages(pools[i], order_flags[i]);
>>  		if (!page)
>>  			continue;
>>  		return page;
>> @@ -406,6 +411,25 @@ static const struct dma_heap_ops system_heap_ops 
>> = {
>>  static int system_heap_create(void)
>>  {
>>  	struct dma_heap_export_info exp_info;
>> +	int i;
>> +
>> +	for (i = 0; i < NUM_ORDERS; i++) {
>> +		struct page_pool_params pp;
>> +
>> +		memset(&pp, 0, sizeof(pp));
>> +		pp.order = orders[i];
>> +		pp.dma_dir = DMA_BIDIRECTIONAL;

Hey John,

Correct me if I'm wrong, but I think that in order for pp.dma_dir to be 
used in either page_pool_alloc_pages() or page_pool_put_full_page(), we 
need to at least have PP_FLAG_DMA_MAP set (to have 
page_pool_dma_sync_for_device() called, PP_FLAG_DMA_SYNC_DEV should also 
be set I think).  I think you'd also need to have pp->dev set.  Are 
we setting dma_dir with the intention of doing the necessary CMOs before 
we start using the page?

Thanks,

Chris.

>> +		pools[i] = page_pool_create(&pp);
>> +
>> +		if (IS_ERR(pools[i])) {
>> +			int j;
>> +
>> +			pr_err("%s: page pool creation failed!\n", __func__);
>> +			for (j = 0; j < i; j++)
>> +				page_pool_destroy(pools[j]);
>> +			return PTR_ERR(pools[i]);
>> +		}
>> +	}
>> 
>>  	exp_info.name = "system";
>>  	exp_info.ops = &system_heap_ops;
> 
> This is cool, I didn't know about this pooling code under /net/core.
> Nice and compact.

-- 
The Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum,
a Linux Foundation Collaborative Project


* Re: [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap
  2020-10-01 14:49     ` Chris Goldsworthy
@ 2020-10-01 18:28       ` John Stultz
  0 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2020-10-01 18:28 UTC (permalink / raw)
  To: Chris Goldsworthy
  Cc: lkml, Sumit Semwal, Liam Mark, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Ørjan Eide, Robin Murphy, Ezequiel Garcia, Simon Ser,
	James Jones, linux-media, dri-devel

On Thu, Oct 1, 2020 at 7:49 AM Chris Goldsworthy
<cgoldswo@codeaurora.org> wrote:
> On 2020-09-29 21:46, Chris Goldsworthy wrote:
> > On 2020-09-25 21:24, John Stultz wrote:
> >> Reuse/abuse the pagepool code from the network code to speed
> >> up allocation performance.
> >>
> >> This is similar to the ION pagepool usage, but tries to
> >> utilize generic code instead of a custom implementation.
> >>
> >> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >> Cc: Liam Mark <lmark@codeaurora.org>
> >> Cc: Laura Abbott <labbott@kernel.org>
> >> Cc: Brian Starkey <Brian.Starkey@arm.com>
> >> Cc: Hridya Valsaraju <hridya@google.com>
> >> Cc: Suren Baghdasaryan <surenb@google.com>
> >> Cc: Sandeep Patil <sspatil@google.com>
> >> Cc: Ørjan Eide <orjan.eide@arm.com>
> >> Cc: Robin Murphy <robin.murphy@arm.com>
> >> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> >> Cc: Simon Ser <contact@emersion.fr>
> >> Cc: James Jones <jajones@nvidia.com>
> >> Cc: linux-media@vger.kernel.org
> >> Cc: dri-devel@lists.freedesktop.org
> >> Signed-off-by: John Stultz <john.stultz@linaro.org>
> >> ---
> >>  drivers/dma-buf/heaps/Kconfig       |  1 +
> >>  drivers/dma-buf/heaps/system_heap.c | 32
> >> +++++++++++++++++++++++++----
> >>  2 files changed, 29 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/dma-buf/heaps/Kconfig
> >> b/drivers/dma-buf/heaps/Kconfig
> >> index a5eef06c4226..f13cde4321b1 100644
> >> --- a/drivers/dma-buf/heaps/Kconfig
> >> +++ b/drivers/dma-buf/heaps/Kconfig
> >> @@ -1,6 +1,7 @@
> >>  config DMABUF_HEAPS_SYSTEM
> >>      bool "DMA-BUF System Heap"
> >>      depends on DMABUF_HEAPS
> >> +    select PAGE_POOL
> >>      help
> >>        Choose this option to enable the system dmabuf heap. The system
> >> heap
> >>        is backed by pages from the buddy allocator. If in doubt, say Y.
> >> diff --git a/drivers/dma-buf/heaps/system_heap.c
> >> b/drivers/dma-buf/heaps/system_heap.c
> >> index 882a632e9bb7..9f57b4c8ae69 100644
> >> --- a/drivers/dma-buf/heaps/system_heap.c
> >> +++ b/drivers/dma-buf/heaps/system_heap.c
> >> @@ -20,6 +20,7 @@
> >>  #include <linux/scatterlist.h>
> >>  #include <linux/slab.h>
> >>  #include <linux/vmalloc.h>
> >> +#include <net/page_pool.h>
> >>
> >>  struct dma_heap *sys_heap;
> >>
> >> @@ -46,6 +47,7 @@ struct dma_heap_attachment {
> >>  static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP,
> >> LOW_ORDER_GFP};
> >>  static const unsigned int orders[] = {8, 4, 0};
> >>  #define NUM_ORDERS ARRAY_SIZE(orders)
> >> +struct page_pool *pools[NUM_ORDERS];
> >>
> >>  static struct sg_table *dup_sg_table(struct sg_table *table)
> >>  {
> >> @@ -264,13 +266,17 @@ static void system_heap_dma_buf_release(struct
> >> dma_buf *dmabuf)
> >>      struct system_heap_buffer *buffer = dmabuf->priv;
> >>      struct sg_table *table;
> >>      struct scatterlist *sg;
> >> -    int i;
> >> +    int i, j;
> >>
> >>      table = &buffer->sg_table;
> >>      for_each_sg(table->sgl, sg, table->nents, i) {
> >>              struct page *page = sg_page(sg);
> >>
> >> -            __free_pages(page, compound_order(page));
> >> +            for (j = 0; j < NUM_ORDERS; j++) {
> >> +                    if (compound_order(page) == orders[j])
> >> +                            break;
> >> +            }
> >> +            page_pool_put_full_page(pools[j], page, false);
> >>      }
> >>      sg_free_table(table);
> >>      kfree(buffer);
> >> @@ -300,8 +306,7 @@ static struct page
> >> *alloc_largest_available(unsigned long size,
> >>                      continue;
> >>              if (max_order < orders[i])
> >>                      continue;
> >> -
> >> -            page = alloc_pages(order_flags[i], orders[i]);
> >> +            page = page_pool_alloc_pages(pools[i], order_flags[i]);
> >>              if (!page)
> >>                      continue;
> >>              return page;
> >> @@ -406,6 +411,25 @@ static const struct dma_heap_ops system_heap_ops
> >> = {
> >>  static int system_heap_create(void)
> >>  {
> >>      struct dma_heap_export_info exp_info;
> >> +    int i;
> >> +
> >> +    for (i = 0; i < NUM_ORDERS; i++) {
> >> +            struct page_pool_params pp;
> >> +
> >> +            memset(&pp, 0, sizeof(pp));
> >> +            pp.order = orders[i];
> >> +            pp.dma_dir = DMA_BIDIRECTIONAL;
>
> Hey John,
>
> Correct me if I'm wrong, but I think that in order for pp.dma_dir to be
> used in either page_pool_alloc_pages() or page_pool_put_full_page(), we
> need to at least have PP_FLAG_DMA_MAP set (to have
> page_pool_dma_sync_for_device() called, PP_FLAG_DMA_SYNC_DEV should also
> be set I think).  I think you'd also need to have pp->dev set.  Are
> we setting dma_dir with the intention of doing the necessary CMOs before
> we start using the page?

Looking, I think my setting of the dma_dir there on the pool is
unnecessary (and as you point out, it doesn't have much effect as long
as the PP_FLAG_DMA_MAP isn't set).
I'm really only using the pagepool as a page cache, and the dmabuf ops
are still used for mapping and syncing operations.

thanks
-john
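
To illustrate the point above, here is a minimal userspace sketch of a pool
whose DMA direction is only consulted when a mapping flag is set. Everything
in it (the toy_* names, TOY_FLAG_DMA_MAP, the struct layouts) is hypothetical
and only models the behaviour discussed in this thread; it is not the
kernel's page_pool code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag and direction values; not the kernel's definitions. */
#define TOY_FLAG_DMA_MAP (1u << 0)

enum toy_dma_dir { TOY_DMA_BIDIRECTIONAL, TOY_DMA_FROM_DEVICE, TOY_DMA_NONE };

struct toy_pool_params {
	unsigned int flags;
	unsigned int order;
	void *dev;			/* device to map for; unused without the flag */
	enum toy_dma_dir dma_dir;
};

struct toy_page {
	bool mapped;
	enum toy_dma_dir mapped_dir;
};

/*
 * Hand out a page. Mapping (and therefore dma_dir) only comes into play
 * when the pool was created with TOY_FLAG_DMA_MAP; otherwise the pool is
 * just a page cache and dma_dir is never read.
 */
struct toy_page toy_pool_get(const struct toy_pool_params *p)
{
	struct toy_page page = { .mapped = false, .mapped_dir = TOY_DMA_NONE };

	if (p->flags & TOY_FLAG_DMA_MAP) {
		assert(p->dev != NULL);	/* a mapping needs a device */
		page.mapped = true;
		page.mapped_dir = p->dma_dir;
	}
	return page;
}
```

With flags left at zero, as in the patch, toy_pool_get() returns an unmapped
page whatever dma_dir holds, which is why the assignment can be dropped when
the pool is used purely as a cache.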


* Re: [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap
  2020-09-30  4:46   ` Chris Goldsworthy
  2020-10-01 14:49     ` Chris Goldsworthy
@ 2020-10-01 22:07     ` John Stultz
  1 sibling, 0 replies; 11+ messages in thread
From: John Stultz @ 2020-10-01 22:07 UTC (permalink / raw)
  To: Chris Goldsworthy
  Cc: lkml, Sumit Semwal, Liam Mark, Laura Abbott, Brian Starkey,
	Hridya Valsaraju, Suren Baghdasaryan, Sandeep Patil,
	Ørjan Eide, Robin Murphy, Ezequiel Garcia, Simon Ser,
	James Jones, linux-media, dri-devel

On Tue, Sep 29, 2020 at 9:46 PM Chris Goldsworthy
<cgoldswo@codeaurora.org> wrote:
>
> On 2020-09-25 21:24, John Stultz wrote:
> > Reuse/abuse the pagepool code from the network code to speed
> > up allocation performance.
> >
> > This is similar to the ION pagepool usage, but tries to
> > utilize generic code instead of a custom implementation.
> >
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: Liam Mark <lmark@codeaurora.org>
> > Cc: Laura Abbott <labbott@kernel.org>
> > Cc: Brian Starkey <Brian.Starkey@arm.com>
> > Cc: Hridya Valsaraju <hridya@google.com>
> > Cc: Suren Baghdasaryan <surenb@google.com>
> > Cc: Sandeep Patil <sspatil@google.com>
> > Cc: Ørjan Eide <orjan.eide@arm.com>
> > Cc: Robin Murphy <robin.murphy@arm.com>
> > Cc: Ezequiel Garcia <ezequiel@collabora.com>
> > Cc: Simon Ser <contact@emersion.fr>
> > Cc: James Jones <jajones@nvidia.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: dri-devel@lists.freedesktop.org
> > Signed-off-by: John Stultz <john.stultz@linaro.org>
> > ---
> >  drivers/dma-buf/heaps/Kconfig       |  1 +
> >  drivers/dma-buf/heaps/system_heap.c | 32 +++++++++++++++++++++++++----
> >  2 files changed, 29 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/dma-buf/heaps/Kconfig
> > b/drivers/dma-buf/heaps/Kconfig
> > index a5eef06c4226..f13cde4321b1 100644
> > --- a/drivers/dma-buf/heaps/Kconfig
> > +++ b/drivers/dma-buf/heaps/Kconfig
> > @@ -1,6 +1,7 @@
> >  config DMABUF_HEAPS_SYSTEM
> >       bool "DMA-BUF System Heap"
> >       depends on DMABUF_HEAPS
> > +     select PAGE_POOL
> >       help
> >         Choose this option to enable the system dmabuf heap. The system
> > heap
> >         is backed by pages from the buddy allocator. If in doubt, say Y.
> > diff --git a/drivers/dma-buf/heaps/system_heap.c
> > b/drivers/dma-buf/heaps/system_heap.c
> > index 882a632e9bb7..9f57b4c8ae69 100644
> > --- a/drivers/dma-buf/heaps/system_heap.c
> > +++ b/drivers/dma-buf/heaps/system_heap.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/scatterlist.h>
> >  #include <linux/slab.h>
> >  #include <linux/vmalloc.h>
> > +#include <net/page_pool.h>
> >
> >  struct dma_heap *sys_heap;
> >
> > @@ -46,6 +47,7 @@ struct dma_heap_attachment {
> >  static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP,
> > LOW_ORDER_GFP};
> >  static const unsigned int orders[] = {8, 4, 0};
> >  #define NUM_ORDERS ARRAY_SIZE(orders)
> > +struct page_pool *pools[NUM_ORDERS];
> >
> >  static struct sg_table *dup_sg_table(struct sg_table *table)
> >  {
> > @@ -264,13 +266,17 @@ static void system_heap_dma_buf_release(struct
> > dma_buf *dmabuf)
> >       struct system_heap_buffer *buffer = dmabuf->priv;
> >       struct sg_table *table;
> >       struct scatterlist *sg;
> > -     int i;
> > +     int i, j;
> >
> >       table = &buffer->sg_table;
> >       for_each_sg(table->sgl, sg, table->nents, i) {
> >               struct page *page = sg_page(sg);
> >
> > -             __free_pages(page, compound_order(page));
> > +             for (j = 0; j < NUM_ORDERS; j++) {
> > +                     if (compound_order(page) == orders[j])
> > +                             break;
> > +             }
> > +             page_pool_put_full_page(pools[j], page, false);
> >       }
> >       sg_free_table(table);
> >       kfree(buffer);
> > @@ -300,8 +306,7 @@ static struct page
> > *alloc_largest_available(unsigned long size,
> >                       continue;
> >               if (max_order < orders[i])
> >                       continue;
> > -
> > -             page = alloc_pages(order_flags[i], orders[i]);
> > +             page = page_pool_alloc_pages(pools[i], order_flags[i]);
> >               if (!page)
> >                       continue;
> >               return page;
> > @@ -406,6 +411,25 @@ static const struct dma_heap_ops system_heap_ops =
> > {
> >  static int system_heap_create(void)
> >  {
> >       struct dma_heap_export_info exp_info;
> > +     int i;
> > +
> > +     for (i = 0; i < NUM_ORDERS; i++) {
> > +             struct page_pool_params pp;
> > +
> > +             memset(&pp, 0, sizeof(pp));
> > +             pp.order = orders[i];
> > +             pp.dma_dir = DMA_BIDIRECTIONAL;
> > +             pools[i] = page_pool_create(&pp);
> > +
> > +             if (IS_ERR(pools[i])) {
> > +                     int j;
> > +
> > +                     pr_err("%s: page pool creation failed!\n", __func__);
> > +                     for (j = 0; j < i; j++)
> > +                             page_pool_destroy(pools[j]);
> > +                     return PTR_ERR(pools[i]);
> > +             }
> > +     }
> >
> >       exp_info.name = "system";
> >       exp_info.ops = &system_heap_ops;
>
> This is cool, I didn't know about this pooling code under /net/core.
> Nice and compact.

Oh, bummer. I just realized when allocating w/ __GFP_ZERO from the
page-pool, the logic doesn't actually clear pages when pulling from
the cache.
So unfortunately the skipped zeroing is what accounts for much of the
performance benefit I was seeing with this approach, and I'll have to
retract my claim on the performance gain. :(

I've got a first pass at zeroing the pages we put into the pool, but
the numbers are not so great just yet so I've got some further work to
do.

thanks
-john
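
The zeroing problem described above can be reproduced with a small userspace
model. This is only a sketch under stated assumptions: the toy_* names, the
want_zero argument (standing in for __GFP_ZERO) and the zero_on_put switch
are all hypothetical, and the pool is a plain array rather than the kernel's
page_pool. It shows a cache hit handing back stale data, and one possible
fix of clearing pages as they are recycled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096
#define TOY_CACHE_SLOTS 8

/* Toy stand-in for a page pool: a small stack of recycled buffers. */
struct toy_pool {
	unsigned char *cache[TOY_CACHE_SLOTS];
	int count;
	bool zero_on_put;	/* hypothetical fix: clear pages when recycled */
};

/*
 * Allocate a "page". want_zero plays the role of __GFP_ZERO: it is honored
 * on the slow path (calloc), but a cache hit returns the buffer untouched.
 */
static unsigned char *toy_pool_alloc(struct toy_pool *pool, bool want_zero)
{
	if (pool->count > 0)
		return pool->cache[--pool->count];	/* fast path: no clearing */
	return want_zero ? calloc(1, TOY_PAGE_SIZE) : malloc(TOY_PAGE_SIZE);
}

/* Return a page to the pool, or free it if the cache is full. */
static void toy_pool_put(struct toy_pool *pool, unsigned char *page)
{
	assert(pool->count >= 0 && pool->count <= TOY_CACHE_SLOTS);
	if (pool->count == TOY_CACHE_SLOTS) {
		free(page);
		return;
	}
	if (pool->zero_on_put)
		memset(page, 0, TOY_PAGE_SIZE);
	pool->cache[pool->count++] = page;
}

/*
 * Write a marker, recycle the page, allocate again with want_zero set,
 * and report whether the marker is still visible to the new user.
 */
bool toy_stale_data_visible(bool zero_on_put)
{
	struct toy_pool pool = { .count = 0, .zero_on_put = zero_on_put };
	unsigned char *page = toy_pool_alloc(&pool, true);
	bool stale;

	if (!page)
		return false;		/* allocation failed; nothing to observe */
	page[0] = 0xAB;			/* previous user's data */
	toy_pool_put(&pool, page);
	page = toy_pool_alloc(&pool, true);	/* served from the cache */
	stale = (page[0] == 0xAB);
	free(page);
	return stale;
}
```

In this model toy_stale_data_visible(false) reports the leak and
toy_stale_data_visible(true) does not. Clearing on put moves the memset cost
to the free path rather than removing it, which fits the observation that the
numbers get worse once pooled pages are zeroed.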


end of thread, other threads:[~2020-10-01 22:08 UTC | newest]

Thread overview: 11+ messages
2020-09-26  4:24 [RFC][PATCH 0/6] dma-buf: Performance improvements for system heap John Stultz
2020-09-26  4:24 ` [RFC][PATCH 1/6] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists John Stultz
2020-09-26  4:24 ` [RFC][PATCH 2/6] dma-buf: heaps: Move heap-helper logic into the cma_heap implementation John Stultz
2020-09-26  4:24 ` [RFC][PATCH 3/6] dma-buf: heaps: Remove heap-helpers code John Stultz
2020-09-26  4:24 ` [RFC][PATCH 4/6] dma-buf: system_heap: Allocate higher order pages if available John Stultz
2020-09-26  4:24 ` [RFC][PATCH 5/6] dma-buf: system_heap: Add pagepool support to system heap John Stultz
2020-09-30  4:46   ` Chris Goldsworthy
2020-10-01 14:49     ` Chris Goldsworthy
2020-10-01 18:28       ` John Stultz
2020-10-01 22:07     ` John Stultz
2020-09-26  4:24 ` [RFC][PATCH 6/6] dma-buf: heaps: Skip sync if not mapped John Stultz
