All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Starkey <brian.starkey@arm.com>
To: John Stultz <john.stultz@linaro.org>
Cc: lkml <linux-kernel@vger.kernel.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Liam Mark" <lmark@codeaurora.org>,
	"Laura Abbott" <labbott@kernel.org>,
	"Hridya Valsaraju" <hridya@google.com>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Sandeep Patil" <sspatil@google.com>,
	"Daniel Mentz" <danielmentz@google.com>,
	"Chris Goldsworthy" <cgoldswo@codeaurora.org>,
	"Ørjan Eide" <orjan.eide@arm.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Ezequiel Garcia" <ezequiel@collabora.com>,
	"Simon Ser" <contact@emersion.fr>,
	"James Jones" <jajones@nvidia.com>,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	nd@arm.com
Subject: Re: [PATCH v3 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap
Date: Thu, 8 Oct 2020 12:51:01 +0100	[thread overview]
Message-ID: <20201008115101.4qi6wh3hhkb6krg5@DESKTOP-E1NTVVP.localdomain> (raw)
In-Reply-To: <20201003040257.62768-8-john.stultz@linaro.org>

On Sat, Oct 03, 2020 at 04:02:57AM +0000, John Stultz wrote:
> This adds a heap that allocates non-contiguous buffers that are
> marked as writecombined, so they are not cached by the CPU.
> 
> This is useful, as most graphics buffers are usually not touched
> by the CPU or only written into once by the CPU. So when mapping
> the buffer over and over between devices, we can skip the CPU
> syncing, which saves a lot of cache management overhead, greatly
> improving performance.
> 
> For folk using ION, there was a ION_FLAG_CACHED flag, which
> signaled if the returned buffer should be CPU cacheable or not.
> With DMA-BUF heaps, we do not yet have such a flag, and by default
> the current heaps (system and cma) produce CPU cachable buffers.
> So for folks transitioning from ION to DMA-BUF Heaps, this fills
> in some of that missing functionality.
> 
> There has been a suggestion to make this functionality a flag
> (DMAHEAP_FLAG_UNCACHED?) on the system heap, similar to how
> ION used the ION_FLAG_CACHED. But I want to make sure an
> _UNCACHED flag would truely be a generic attribute across all
> heaps. So far that has been unclear, so having it as a separate
> heap seemes better for now. (But I'm open to discussion on this
> point!)
> 
> This is a rework of earlier efforts to add a uncached system heap,
> done utilizing the exisitng system heap, adding just a bit of
> logic to handle the uncached case.
> 
> Feedback would be very welcome!
> 
> Many thanks to Liam Mark for his help to get this working.
> 
> Pending opensource users of this code include:
> * AOSP HiKey960 gralloc:
>   - https://android-review.googlesource.com/c/device/linaro/hikey/+/1399519
>   - Visibly improves performance over the system heap
> * AOSP Codec2 (possibly, needs more review):
>   - https://android-review.googlesource.com/c/platform/frameworks/av/+/1360640/17/media/codec2/vndk/C2DmaBufAllocator.cpp#325
> 
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Liam Mark <lmark@codeaurora.org>
> Cc: Laura Abbott <labbott@kernel.org>
> Cc: Brian Starkey <Brian.Starkey@arm.com>
> Cc: Hridya Valsaraju <hridya@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Sandeep Patil <sspatil@google.com>
> Cc: Daniel Mentz <danielmentz@google.com>
> Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
> Cc: �rjan Eide <orjan.eide@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: James Jones <jajones@nvidia.com>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>  drivers/dma-buf/heaps/system_heap.c | 87 ++++++++++++++++++++++++++---
>  1 file changed, 79 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index 2b8d4b6abacb..952f1fd9dacf 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -22,6 +22,7 @@
>  #include <linux/vmalloc.h>
>  
>  struct dma_heap *sys_heap;
> +struct dma_heap *sys_uncached_heap;
>  
>  struct system_heap_buffer {
>  	struct dma_heap *heap;
> @@ -31,6 +32,8 @@ struct system_heap_buffer {
>  	struct sg_table sg_table;
>  	int vmap_cnt;
>  	void *vaddr;
> +
> +	bool uncached;
>  };
>  
>  struct dma_heap_attachment {
> @@ -38,6 +41,8 @@ struct dma_heap_attachment {
>  	struct sg_table *table;
>  	struct list_head list;
>  	bool mapped;
> +
> +	bool uncached;
>  };
>  
>  #define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
> @@ -94,7 +99,7 @@ static int system_heap_attach(struct dma_buf *dmabuf,
>  	a->dev = attachment->dev;
>  	INIT_LIST_HEAD(&a->list);
>  	a->mapped = false;
> -
> +	a->uncached = buffer->uncached;
>  	attachment->priv = a;
>  
>  	mutex_lock(&buffer->lock);
> @@ -124,9 +129,13 @@ static struct sg_table *system_heap_map_dma_buf(struct dma_buf_attachment *attac
>  {
>  	struct dma_heap_attachment *a = attachment->priv;
>  	struct sg_table *table = a->table;
> +	int attr = 0;
>  	int ret;
>  
> -	ret = dma_map_sgtable(attachment->dev, table, direction, 0);
> +	if (a->uncached)
> +		attr = DMA_ATTR_SKIP_CPU_SYNC;
> +
> +	ret = dma_map_sgtable(attachment->dev, table, direction, attr);
>  	if (ret)
>  		return ERR_PTR(ret);
>  
> @@ -139,9 +148,12 @@ static void system_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
>  				      enum dma_data_direction direction)
>  {
>  	struct dma_heap_attachment *a = attachment->priv;
> +	int attr = 0;
>  
> +	if (a->uncached)
> +		attr = DMA_ATTR_SKIP_CPU_SYNC;
>  	a->mapped = false;
> -	dma_unmap_sgtable(attachment->dev, table, direction, 0);
> +	dma_unmap_sgtable(attachment->dev, table, direction, attr);
>  }
>  
>  static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> @@ -150,6 +162,9 @@ static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>  	struct system_heap_buffer *buffer = dmabuf->priv;
>  	struct dma_heap_attachment *a;
>  
> +	if (buffer->uncached)
> +		return 0;
> +
>  	mutex_lock(&buffer->lock);
>  
>  	if (buffer->vmap_cnt)
> @@ -171,6 +186,9 @@ static int system_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
>  	struct system_heap_buffer *buffer = dmabuf->priv;
>  	struct dma_heap_attachment *a;
>  
> +	if (buffer->uncached)
> +		return 0;
> +
>  	mutex_lock(&buffer->lock);
>  
>  	if (buffer->vmap_cnt)
> @@ -194,6 +212,9 @@ static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
>  	struct sg_page_iter piter;
>  	int ret;
>  
> +	if (buffer->uncached)
> +		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> +
>  	for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
>  		struct page *page = sg_page_iter_page(&piter);
>  
> @@ -215,8 +236,12 @@ static void *system_heap_do_vmap(struct system_heap_buffer *buffer)
>  	struct page **pages = vmalloc(sizeof(struct page *) * npages);
>  	struct page **tmp = pages;
>  	struct sg_page_iter piter;
> +	pgprot_t pgprot = PAGE_KERNEL;
>  	void *vaddr;
>  
> +	if (buffer->uncached)
> +		pgprot = pgprot_writecombine(PAGE_KERNEL);

I think this should go after the 'if (!pages)' check. Best to get the
allocation failure check as close to the allocation as possible IMO.

> +
>  	if (!pages)
>  		return ERR_PTR(-ENOMEM);
>  
> @@ -225,7 +250,7 @@ static void *system_heap_do_vmap(struct system_heap_buffer *buffer)
>  		*tmp++ = sg_page_iter_page(&piter);
>  	}
>  
> -	vaddr = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
> +	vaddr = vmap(pages, npages, VM_MAP, pgprot);
>  	vfree(pages);
>  
>  	if (!vaddr)
> @@ -278,6 +303,10 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
>  	int i;
>  
>  	table = &buffer->sg_table;
> +	/* Unmap the uncached buffers from the heap device (pairs with the map on allocate) */
> +	if (buffer->uncached)
> +		dma_unmap_sgtable(dma_heap_get_dev(buffer->heap), table, DMA_BIDIRECTIONAL, 0);
> +
>  	for_each_sg(table->sgl, sg, table->nents, i) {
>  		struct page *page = sg_page(sg);
>  
> @@ -320,10 +349,11 @@ static struct page *alloc_largest_available(unsigned long size,
>  	return NULL;
>  }
>  
> -static int system_heap_allocate(struct dma_heap *heap,
> -				unsigned long len,
> -				unsigned long fd_flags,
> -				unsigned long heap_flags)
> +static int system_heap_do_allocate(struct dma_heap *heap,
> +				   unsigned long len,
> +				   unsigned long fd_flags,
> +				   unsigned long heap_flags,
> +				   bool uncached)
>  {
>  	struct system_heap_buffer *buffer;
>  	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
> @@ -344,6 +374,7 @@ static int system_heap_allocate(struct dma_heap *heap,
>  	mutex_init(&buffer->lock);
>  	buffer->heap = heap;
>  	buffer->len = len;
> +	buffer->uncached = uncached;
>  
>  	INIT_LIST_HEAD(&pages);
>  	i = 0;
> @@ -393,6 +424,16 @@ static int system_heap_allocate(struct dma_heap *heap,
>  		/* just return, as put will call release and that will free */
>  		return ret;
>  	}
> +
> +	/*
> +	 * For uncached buffers, we need to initially flush cpu cache, since
> +	 * the __GFP_ZERO on the allocation means the zeroing was done by the
> +	 * cpu and thus it is likely cached. Map (and implicitly flush) it out
> +	 * now so we don't get corruption later on.
> +	 */
> +	if (buffer->uncached)
> +		dma_map_sgtable(dma_heap_get_dev(heap), table, DMA_BIDIRECTIONAL, 0);

Do we have to keep this mapping around for the entire lifetime of the
buffer?

Also, this problem (and solution) keeps lingering around. It really
feels like there should be a better way to solve "clean the linear
mapping all the way to DRAM", but I don't know what that should be.

> +
>  	return ret;
>  
>  free_pages:
> @@ -410,10 +451,30 @@ static int system_heap_allocate(struct dma_heap *heap,
>  	return ret;
>  }
>  
> +static int system_heap_allocate(struct dma_heap *heap,
> +				unsigned long len,
> +				unsigned long fd_flags,
> +				unsigned long heap_flags)
> +{
> +	return system_heap_do_allocate(heap, len, fd_flags, heap_flags, false);
> +}
> +
>  static const struct dma_heap_ops system_heap_ops = {
>  	.allocate = system_heap_allocate,
>  };
>  
> +static int system_uncached_heap_allocate(struct dma_heap *heap,
> +					 unsigned long len,
> +					 unsigned long fd_flags,
> +					 unsigned long heap_flags)
> +{
> +	return system_heap_do_allocate(heap, len, fd_flags, heap_flags, true);
> +}
> +
> +static const struct dma_heap_ops system_uncached_heap_ops = {
> +	.allocate = system_uncached_heap_allocate,
> +};
> +
>  static int system_heap_create(void)
>  {
>  	struct dma_heap_export_info exp_info;
> @@ -426,6 +487,16 @@ static int system_heap_create(void)
>  	if (IS_ERR(sys_heap))
>  		return PTR_ERR(sys_heap);
>  
> +	exp_info.name = "system-uncached";
> +	exp_info.ops = &system_uncached_heap_ops;
> +	exp_info.priv = NULL;
> +
> +	sys_uncached_heap = dma_heap_add(&exp_info);
> +	if (IS_ERR(sys_uncached_heap))
> +		return PTR_ERR(sys_heap);
> +

In principle, there's a race here between the heap getting registered
to sysfs and the dma_mask getting updated.

I don't think it would cause a problem in practice, but with the API
as it is, there's no way to avoid it - so I wonder if the
dma_heap_get_dev() accessor isn't the right API design.

Cheers,
-Brian

> +	dma_coerce_mask_and_coherent(dma_heap_get_dev(sys_uncached_heap), DMA_BIT_MASK(64));
> +
>  	return 0;
>  }
>  module_init(system_heap_create);
> -- 
> 2.17.1
> 

WARNING: multiple messages have this Message-ID (diff)
From: Brian Starkey <brian.starkey@arm.com>
To: John Stultz <john.stultz@linaro.org>
Cc: nd@arm.com, "Sandeep Patil" <sspatil@google.com>,
	dri-devel@lists.freedesktop.org,
	"Ezequiel Garcia" <ezequiel@collabora.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"James Jones" <jajones@nvidia.com>,
	lkml <linux-kernel@vger.kernel.org>,
	"Liam Mark" <lmark@codeaurora.org>,
	"Laura Abbott" <labbott@kernel.org>,
	"Chris Goldsworthy" <cgoldswo@codeaurora.org>,
	"Hridya Valsaraju" <hridya@google.com>,
	"Ørjan Eide" <orjan.eide@arm.com>,
	linux-media@vger.kernel.org,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Daniel Mentz" <danielmentz@google.com>
Subject: Re: [PATCH v3 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap
Date: Thu, 8 Oct 2020 12:51:01 +0100	[thread overview]
Message-ID: <20201008115101.4qi6wh3hhkb6krg5@DESKTOP-E1NTVVP.localdomain> (raw)
In-Reply-To: <20201003040257.62768-8-john.stultz@linaro.org>

On Sat, Oct 03, 2020 at 04:02:57AM +0000, John Stultz wrote:
> This adds a heap that allocates non-contiguous buffers that are
> marked as writecombined, so they are not cached by the CPU.
> 
> This is useful, as most graphics buffers are usually not touched
> by the CPU or only written into once by the CPU. So when mapping
> the buffer over and over between devices, we can skip the CPU
> syncing, which saves a lot of cache management overhead, greatly
> improving performance.
> 
> For folk using ION, there was a ION_FLAG_CACHED flag, which
> signaled if the returned buffer should be CPU cacheable or not.
> With DMA-BUF heaps, we do not yet have such a flag, and by default
> the current heaps (system and cma) produce CPU cachable buffers.
> So for folks transitioning from ION to DMA-BUF Heaps, this fills
> in some of that missing functionality.
> 
> There has been a suggestion to make this functionality a flag
> (DMAHEAP_FLAG_UNCACHED?) on the system heap, similar to how
> ION used the ION_FLAG_CACHED. But I want to make sure an
> _UNCACHED flag would truely be a generic attribute across all
> heaps. So far that has been unclear, so having it as a separate
> heap seemes better for now. (But I'm open to discussion on this
> point!)
> 
> This is a rework of earlier efforts to add a uncached system heap,
> done utilizing the exisitng system heap, adding just a bit of
> logic to handle the uncached case.
> 
> Feedback would be very welcome!
> 
> Many thanks to Liam Mark for his help to get this working.
> 
> Pending opensource users of this code include:
> * AOSP HiKey960 gralloc:
>   - https://android-review.googlesource.com/c/device/linaro/hikey/+/1399519
>   - Visibly improves performance over the system heap
> * AOSP Codec2 (possibly, needs more review):
>   - https://android-review.googlesource.com/c/platform/frameworks/av/+/1360640/17/media/codec2/vndk/C2DmaBufAllocator.cpp#325
> 
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Liam Mark <lmark@codeaurora.org>
> Cc: Laura Abbott <labbott@kernel.org>
> Cc: Brian Starkey <Brian.Starkey@arm.com>
> Cc: Hridya Valsaraju <hridya@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Sandeep Patil <sspatil@google.com>
> Cc: Daniel Mentz <danielmentz@google.com>
> Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
> Cc: �rjan Eide <orjan.eide@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: James Jones <jajones@nvidia.com>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>  drivers/dma-buf/heaps/system_heap.c | 87 ++++++++++++++++++++++++++---
>  1 file changed, 79 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index 2b8d4b6abacb..952f1fd9dacf 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -22,6 +22,7 @@
>  #include <linux/vmalloc.h>
>  
>  struct dma_heap *sys_heap;
> +struct dma_heap *sys_uncached_heap;
>  
>  struct system_heap_buffer {
>  	struct dma_heap *heap;
> @@ -31,6 +32,8 @@ struct system_heap_buffer {
>  	struct sg_table sg_table;
>  	int vmap_cnt;
>  	void *vaddr;
> +
> +	bool uncached;
>  };
>  
>  struct dma_heap_attachment {
> @@ -38,6 +41,8 @@ struct dma_heap_attachment {
>  	struct sg_table *table;
>  	struct list_head list;
>  	bool mapped;
> +
> +	bool uncached;
>  };
>  
>  #define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
> @@ -94,7 +99,7 @@ static int system_heap_attach(struct dma_buf *dmabuf,
>  	a->dev = attachment->dev;
>  	INIT_LIST_HEAD(&a->list);
>  	a->mapped = false;
> -
> +	a->uncached = buffer->uncached;
>  	attachment->priv = a;
>  
>  	mutex_lock(&buffer->lock);
> @@ -124,9 +129,13 @@ static struct sg_table *system_heap_map_dma_buf(struct dma_buf_attachment *attac
>  {
>  	struct dma_heap_attachment *a = attachment->priv;
>  	struct sg_table *table = a->table;
> +	int attr = 0;
>  	int ret;
>  
> -	ret = dma_map_sgtable(attachment->dev, table, direction, 0);
> +	if (a->uncached)
> +		attr = DMA_ATTR_SKIP_CPU_SYNC;
> +
> +	ret = dma_map_sgtable(attachment->dev, table, direction, attr);
>  	if (ret)
>  		return ERR_PTR(ret);
>  
> @@ -139,9 +148,12 @@ static void system_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
>  				      enum dma_data_direction direction)
>  {
>  	struct dma_heap_attachment *a = attachment->priv;
> +	int attr = 0;
>  
> +	if (a->uncached)
> +		attr = DMA_ATTR_SKIP_CPU_SYNC;
>  	a->mapped = false;
> -	dma_unmap_sgtable(attachment->dev, table, direction, 0);
> +	dma_unmap_sgtable(attachment->dev, table, direction, attr);
>  }
>  
>  static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> @@ -150,6 +162,9 @@ static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>  	struct system_heap_buffer *buffer = dmabuf->priv;
>  	struct dma_heap_attachment *a;
>  
> +	if (buffer->uncached)
> +		return 0;
> +
>  	mutex_lock(&buffer->lock);
>  
>  	if (buffer->vmap_cnt)
> @@ -171,6 +186,9 @@ static int system_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
>  	struct system_heap_buffer *buffer = dmabuf->priv;
>  	struct dma_heap_attachment *a;
>  
> +	if (buffer->uncached)
> +		return 0;
> +
>  	mutex_lock(&buffer->lock);
>  
>  	if (buffer->vmap_cnt)
> @@ -194,6 +212,9 @@ static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
>  	struct sg_page_iter piter;
>  	int ret;
>  
> +	if (buffer->uncached)
> +		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> +
>  	for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
>  		struct page *page = sg_page_iter_page(&piter);
>  
> @@ -215,8 +236,12 @@ static void *system_heap_do_vmap(struct system_heap_buffer *buffer)
>  	struct page **pages = vmalloc(sizeof(struct page *) * npages);
>  	struct page **tmp = pages;
>  	struct sg_page_iter piter;
> +	pgprot_t pgprot = PAGE_KERNEL;
>  	void *vaddr;
>  
> +	if (buffer->uncached)
> +		pgprot = pgprot_writecombine(PAGE_KERNEL);

I think this should go after the 'if (!pages)' check. Best to get the
allocation failure check as close to the allocation as possible IMO.

> +
>  	if (!pages)
>  		return ERR_PTR(-ENOMEM);
>  
> @@ -225,7 +250,7 @@ static void *system_heap_do_vmap(struct system_heap_buffer *buffer)
>  		*tmp++ = sg_page_iter_page(&piter);
>  	}
>  
> -	vaddr = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
> +	vaddr = vmap(pages, npages, VM_MAP, pgprot);
>  	vfree(pages);
>  
>  	if (!vaddr)
> @@ -278,6 +303,10 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
>  	int i;
>  
>  	table = &buffer->sg_table;
> +	/* Unmap the uncached buffers from the heap device (pairs with the map on allocate) */
> +	if (buffer->uncached)
> +		dma_unmap_sgtable(dma_heap_get_dev(buffer->heap), table, DMA_BIDIRECTIONAL, 0);
> +
>  	for_each_sg(table->sgl, sg, table->nents, i) {
>  		struct page *page = sg_page(sg);
>  
> @@ -320,10 +349,11 @@ static struct page *alloc_largest_available(unsigned long size,
>  	return NULL;
>  }
>  
> -static int system_heap_allocate(struct dma_heap *heap,
> -				unsigned long len,
> -				unsigned long fd_flags,
> -				unsigned long heap_flags)
> +static int system_heap_do_allocate(struct dma_heap *heap,
> +				   unsigned long len,
> +				   unsigned long fd_flags,
> +				   unsigned long heap_flags,
> +				   bool uncached)
>  {
>  	struct system_heap_buffer *buffer;
>  	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
> @@ -344,6 +374,7 @@ static int system_heap_allocate(struct dma_heap *heap,
>  	mutex_init(&buffer->lock);
>  	buffer->heap = heap;
>  	buffer->len = len;
> +	buffer->uncached = uncached;
>  
>  	INIT_LIST_HEAD(&pages);
>  	i = 0;
> @@ -393,6 +424,16 @@ static int system_heap_allocate(struct dma_heap *heap,
>  		/* just return, as put will call release and that will free */
>  		return ret;
>  	}
> +
> +	/*
> +	 * For uncached buffers, we need to initially flush cpu cache, since
> +	 * the __GFP_ZERO on the allocation means the zeroing was done by the
> +	 * cpu and thus it is likely cached. Map (and implicitly flush) it out
> +	 * now so we don't get corruption later on.
> +	 */
> +	if (buffer->uncached)
> +		dma_map_sgtable(dma_heap_get_dev(heap), table, DMA_BIDIRECTIONAL, 0);

Do we have to keep this mapping around for the entire lifetime of the
buffer?

Also, this problem (and solution) keeps lingering around. It really
feels like there should be a better way to solve "clean the linear
mapping all the way to DRAM", but I don't know what that should be.

> +
>  	return ret;
>  
>  free_pages:
> @@ -410,10 +451,30 @@ static int system_heap_allocate(struct dma_heap *heap,
>  	return ret;
>  }
>  
> +static int system_heap_allocate(struct dma_heap *heap,
> +				unsigned long len,
> +				unsigned long fd_flags,
> +				unsigned long heap_flags)
> +{
> +	return system_heap_do_allocate(heap, len, fd_flags, heap_flags, false);
> +}
> +
>  static const struct dma_heap_ops system_heap_ops = {
>  	.allocate = system_heap_allocate,
>  };
>  
> +static int system_uncached_heap_allocate(struct dma_heap *heap,
> +					 unsigned long len,
> +					 unsigned long fd_flags,
> +					 unsigned long heap_flags)
> +{
> +	return system_heap_do_allocate(heap, len, fd_flags, heap_flags, true);
> +}
> +
> +static const struct dma_heap_ops system_uncached_heap_ops = {
> +	.allocate = system_uncached_heap_allocate,
> +};
> +
>  static int system_heap_create(void)
>  {
>  	struct dma_heap_export_info exp_info;
> @@ -426,6 +487,16 @@ static int system_heap_create(void)
>  	if (IS_ERR(sys_heap))
>  		return PTR_ERR(sys_heap);
>  
> +	exp_info.name = "system-uncached";
> +	exp_info.ops = &system_uncached_heap_ops;
> +	exp_info.priv = NULL;
> +
> +	sys_uncached_heap = dma_heap_add(&exp_info);
> +	if (IS_ERR(sys_uncached_heap))
> +		return PTR_ERR(sys_heap);
> +

In principle, there's a race here between the heap getting registered
to sysfs and the dma_mask getting updated.

I don't think it would cause a problem in practice, but with the API
as it is, there's no way to avoid it - so I wonder if the
dma_heap_get_dev() accessor isn't the right API design.

Cheers,
-Brian

> +	dma_coerce_mask_and_coherent(dma_heap_get_dev(sys_uncached_heap), DMA_BIT_MASK(64));
> +
>  	return 0;
>  }
>  module_init(system_heap_create);
> -- 
> 2.17.1
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  parent reply	other threads:[~2020-10-08 11:51 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-03  4:02 [PATCH v3 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation John Stultz
2020-10-03  4:02 ` John Stultz
2020-10-03  4:02 ` [PATCH v3 1/7] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists John Stultz
2020-10-03  4:02   ` John Stultz
2020-10-03  4:02 ` [PATCH v3 2/7] dma-buf: heaps: Move heap-helper logic into the cma_heap implementation John Stultz
2020-10-03  4:02   ` John Stultz
2020-10-03  4:02 ` [PATCH v3 3/7] dma-buf: heaps: Remove heap-helpers code John Stultz
2020-10-03  4:02   ` John Stultz
2020-10-03  4:02 ` [PATCH v3 4/7] dma-buf: heaps: Skip sync if not mapped John Stultz
2020-10-03  4:02   ` John Stultz
2020-10-03  4:02 ` [PATCH v3 5/7] dma-buf: system_heap: Allocate higher order pages if available John Stultz
2020-10-03  4:02   ` John Stultz
2020-10-03  4:02 ` [PATCH v3 6/7] dma-buf: dma-heap: Keep track of the heap device struct John Stultz
2020-10-03  4:02   ` John Stultz
2020-10-03  4:02 ` [PATCH v3 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap John Stultz
2020-10-03  4:02   ` John Stultz
2020-10-05 13:45   ` Christoph Hellwig
2020-10-08  5:03     ` John Stultz
2020-10-08  5:03       ` John Stultz
2020-10-05 21:21   ` kernel test robot
2020-10-05 21:21     ` kernel test robot
2020-10-05 21:21   ` [RFC PATCH] dma-buf: system_heap: sys_uncached_heap can be static kernel test robot
2020-10-05 21:21     ` kernel test robot
2020-10-07  7:43   ` [PATCH v3 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap Dan Carpenter
2020-10-07  7:43     ` Dan Carpenter
2020-10-07  7:43     ` Dan Carpenter
2020-10-08  5:38     ` John Stultz
2020-10-08  5:38       ` John Stultz
2020-10-08 11:51   ` Brian Starkey [this message]
2020-10-08 11:51     ` Brian Starkey
2020-10-16 19:03     ` John Stultz
2020-10-16 19:03       ` John Stultz
2020-10-16 23:15       ` John Stultz
2020-10-16 23:15         ` John Stultz
2021-01-29  1:23       ` Daniel Mentz
2021-01-29  1:23         ` Daniel Mentz
2020-10-08 11:36 ` [PATCH v3 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation Brian Starkey
2020-10-08 11:36   ` Brian Starkey
2020-10-16 18:47   ` John Stultz
2020-10-16 18:47     ` John Stultz
2020-10-03  8:24 [PATCH v3 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap kernel test robot
2020-10-03 17:41 kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201008115101.4qi6wh3hhkb6krg5@DESKTOP-E1NTVVP.localdomain \
    --to=brian.starkey@arm.com \
    --cc=cgoldswo@codeaurora.org \
    --cc=contact@emersion.fr \
    --cc=danielmentz@google.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=ezequiel@collabora.com \
    --cc=hridya@google.com \
    --cc=jajones@nvidia.com \
    --cc=john.stultz@linaro.org \
    --cc=labbott@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=lmark@codeaurora.org \
    --cc=nd@arm.com \
    --cc=orjan.eide@arm.com \
    --cc=robin.murphy@arm.com \
    --cc=sspatil@google.com \
    --cc=sumit.semwal@linaro.org \
    --cc=surenb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.