dri-devel.lists.freedesktop.org archive mirror
* [RESEND][PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation
@ 2020-10-29  0:16 John Stultz
  2020-10-29  0:16 ` [PATCH v4 1/7] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists John Stultz
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: John Stultz @ 2020-10-29  0:16 UTC
  To: lkml
  Cc: Sandeep Patil, dri-devel, Ezequiel Garcia, Robin Murphy,
	James Jones, Liam Mark, Laura Abbott, Chris Goldsworthy,
	Hridya Valsaraju, Ørjan Eide, linux-media,
	Suren Baghdasaryan, Daniel Mentz

Hey All,
  So just wanted to resend my last revision of my patch series
of performance optimizations to the dma-buf system heap.

This series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After that, the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked. As more heaps show up I
think we'll have a better idea how to best share code, so for
now I think this is ok.

After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.
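
To illustrate the idea, here is a rough userspace sketch (not the
code from the patch). The struct attachment type, its fields, and
sync_for_cpu() are hypothetical stand-ins: the walk over a buffer's
attachments simply skips any attachment that is not currently mapped,
so no cache maintenance is issued for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-device attachment. */
struct attachment {
	bool mapped;              /* true while the device holds a mapping */
	int sync_count;           /* counts simulated cache-maintenance calls */
	struct attachment *next;  /* next attachment on the same buffer */
};

/* Stand-in for an expensive cache sync; only valid on mapped attachments. */
static void sync_for_cpu(struct attachment *a)
{
	assert(a->mapped);
	a->sync_count++;
}

/*
 * Walk every attachment on the buffer and sync only the mapped ones.
 * Returns how many attachments were actually synced.
 */
static int begin_cpu_access(struct attachment *head)
{
	int synced = 0;

	for (struct attachment *a = head; a; a = a->next) {
		if (!a->mapped)
			continue;  /* nothing mapped, so nothing to sync */
		sync_for_cpu(a);
		synced++;
	}
	return synced;
}
```

With three attachments of which only two are mapped, the walk issues
two syncs and leaves the unmapped attachment untouched.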

Next, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION allocation code (though there still is a gap due
to ION using a mix of deferred-freeing and page pools, I'll be
looking at integrating those eventually).
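
As a rough userspace sketch of the larger-order approach (again, not
the code from the patch): the buffer is filled greedily with the
largest chunk that still fits, never going back up to a larger order
than the previous chunk. The order list {8, 4, 0}, the 4 KiB
PAGE_SIZE, and the helper names pick_order() and count_chunks() are
assumptions made for illustration, and the sketch does not model a
large allocation failing and falling back to a smaller order.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL  /* assumed page size */

/* Candidate orders, largest first (assumed values): 1 MiB, 64 KiB, 4 KiB. */
static const unsigned int orders[] = { 8, 4, 0 };
#define NUM_ORDERS (sizeof(orders) / sizeof(orders[0]))

/*
 * Return the largest order whose chunk fits in 'remaining' bytes and
 * does not exceed 'max_order', or -1 if no order fits.
 */
static int pick_order(unsigned long remaining, unsigned int max_order)
{
	for (size_t i = 0; i < NUM_ORDERS; i++) {
		if (remaining < (PAGE_SIZE << orders[i]))
			continue;  /* chunk would overshoot the request */
		if (orders[i] > max_order)
			continue;  /* do not go back up to a larger order */
		return (int)orders[i];
	}
	return -1;
}

/* Count the chunks needed to cover 'size' bytes (a multiple of PAGE_SIZE). */
static unsigned int count_chunks(unsigned long size)
{
	unsigned int chunks = 0;
	unsigned int max_order = orders[0];

	assert(size % PAGE_SIZE == 0);
	while (size > 0) {
		int order = pick_order(size, max_order);

		assert(order >= 0);  /* order 0 always fits a page-aligned remainder */
		size -= PAGE_SIZE << order;
		max_order = (unsigned int)order;
		chunks++;
	}
	return chunks;
}
```

Under these assumptions a 2 MiB buffer needs 2 chunks rather than 512
single pages, which also means far fewer entries to track per buffer.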

Finally, a reworked version of my uncached system heap
implementation I was submitting a few weeks back. Since it
duplicated a lot of the now reworked system heap code, I
realized it would be much simpler to add the functionality to
the system_heap implementation itself.

While not improving the core allocation performance, the
uncached heap allocations do result in *much* improved
performance on HiKey960 as it avoids a lot of flushing and
invalidating buffers that the cpu doesn't touch often.

Feedback on these would be great!

thanks
-john

New in v4:
* Make sys_heap static (indirectly) Reported-by:
     kernel test robot <lkp@intel.com>
* Spelling fixes suggested by BrianS
* Make sys_uncached_heap static, as
    Reported-by: kernel test robot <lkp@intel.com>
* Fix wrong return value, caught by smatch
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
* Ensure we call flush/invalidate_kernel_vmap_range() in the
  uncached cases to try to address feedback about VIVT caches
  from Christoph
* Reorder a few lines as suggested by BrianS
* Avoid holding the initial mapping for the lifetime of the buffer
  as suggested by BrianS
* Fix an unlikely race between allocate and updating the dma_mask
  that BrianS noticed.

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (7):
  dma-buf: system_heap: Rework system heap to use sgtables instead of
    pagelists
  dma-buf: heaps: Move heap-helper logic into the cma_heap
    implementation
  dma-buf: heaps: Remove heap-helpers code
  dma-buf: heaps: Skip sync if not mapped
  dma-buf: system_heap: Allocate higher order pages if available
  dma-buf: dma-heap: Keep track of the heap device struct
  dma-buf: system_heap: Add a system-uncached heap re-using the system
    heap

 drivers/dma-buf/dma-heap.c           |  33 +-
 drivers/dma-buf/heaps/Makefile       |   1 -
 drivers/dma-buf/heaps/cma_heap.c     | 324 +++++++++++++++---
 drivers/dma-buf/heaps/heap-helpers.c | 270 ---------------
 drivers/dma-buf/heaps/heap-helpers.h |  53 ---
 drivers/dma-buf/heaps/system_heap.c  | 488 ++++++++++++++++++++++++---
 include/linux/dma-heap.h             |   9 +
 7 files changed, 747 insertions(+), 431 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

-- 
2.17.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: [PATCH v4 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap
@ 2020-10-30  7:50 Hillf Danton
  2020-10-30 19:21 ` John Stultz
  0 siblings, 1 reply; 17+ messages in thread
From: Hillf Danton @ 2020-10-30  7:50 UTC
  To: John Stultz
  Cc: James Jones, Robin Murphy, Liam Mark, lkml, Christoph Hellwig,
	dri-devel, Ezequiel Garcia, linux-media

On Thu, 29 Oct 2020 21:04:30 -0700 John Stultz wrote:
> 
> But I'll try to share my thoughts:
> 
> So the system heap allows for allocation of non-contiguous buffers
> (currently allocated from page_alloc), which we keep track of using
> sglists.
> Since the resulting dmabufs are shared between multiple devices, we
> want to provide a *specific type of memory* (in this case
> non-contiguous system memory), rather than what the underlying
> dma_alloc_attr() allocates for a specific device.

If the memory slice (just a page in the simple case) from
dma_alloc_attr() (for device-A) does not work for device-B, then
page_alloc cannot do the job either, imho.
> 
> My sense is dma_mmap_wc() likely ought to be paired with switching to
> using dma_alloc_wc() as well, which calls down to dma_alloc_attr().
> Maybe one could use dma_alloc_attr against the heap device to allocate
> chunks that we track in the sglist. But I'm not sure how that saves us
> much other than possibly swapping dma_mmap_wc() for remap_pfn_range()?

* Re: [PATCH v4 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap
@ 2020-10-30  2:47 Hillf Danton
  2020-10-30  4:04 ` John Stultz
  0 siblings, 1 reply; 17+ messages in thread
From: Hillf Danton @ 2020-10-30  2:47 UTC
  To: John Stultz
  Cc: James Jones, Robin Murphy, Liam Mark, lkml, Christoph Hellwig,
	dri-devel, Ezequiel Garcia, linux-media

On Thu, 29 Oct 2020 15:28:34 -0700 John Stultz wrote:
> On Thu, Oct 29, 2020 at 12:10 AM Hillf Danton <hdanton@sina.com> wrote:
> > On Thu, 29 Oct 2020 00:16:24 +0000 John Stultz wrote:
> > > @@ -194,6 +210,9 @@ static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> > >       struct sg_page_iter piter;
> > >       int ret;
> > >
> > > +     if (buffer->uncached)
> > > +             vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> > > +
> >
> > Wonder why you turn back to dma_mmap_wc() and friends?
> 
> Sorry, can you expand on what you are proposing here instead?  I'm not
> sure I see how dma_alloc/mmap/*_wc() quite fits here.

I just wondered if *_wc() could save you two minutes or three. Can you
shed some light on your concerns about their unfitness?

* [PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation
@ 2020-10-17  1:32 John Stultz
  2020-10-17  1:32 ` [PATCH v4 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap John Stultz
  0 siblings, 1 reply; 17+ messages in thread
From: John Stultz @ 2020-10-17  1:32 UTC
  To: lkml
  Cc: Sandeep Patil, dri-devel, Ezequiel Garcia, Robin Murphy,
	James Jones, Liam Mark, Laura Abbott, Chris Goldsworthy,
	Hridya Valsaraju, Ørjan Eide, linux-media,
	Suren Baghdasaryan, Daniel Mentz

Hey All,
  So this is another revision of my patch series of performance
optimizations to the dma-buf system heap.

This series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After that, the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked.

After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.

Next, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION allocation code (though there still is a gap due
to ION using a mix of deferred-freeing and page pools, I'll be
looking at integrating those eventually).

Finally, a reworked version of my uncached system heap
implementation I was submitting a few weeks back. Since it
duplicated a lot of the now reworked system heap code, I
realized it would be much simpler to add the functionality to
the system_heap implementation itself.

While not improving the core allocation performance, the
uncached heap allocations do result in *much* improved
performance on HiKey960 as it avoids a lot of flushing and
invalidating buffers that the cpu doesn't touch often.

Feedback on these would be great!

thanks
-john

New in v4:
* Make sys_heap static (indirectly) Reported-by:
     kernel test robot <lkp@intel.com>
* Spelling fixes suggested by BrianS
* Make sys_uncached_heap static, as
    Reported-by: kernel test robot <lkp@intel.com>
* Fix wrong return value, caught by smatch
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
* Ensure we call flush/invalidate_kernel_vmap_range() in the
  uncached cases to try to address feedback about VIVT caches
  from Christoph
* Reorder a few lines as suggested by BrianS
* Avoid holding the initial mapping for the lifetime of the buffer
  as suggested by BrianS
* Fix an unlikely race between allocate and updating the dma_mask
  that BrianS noticed.


Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (7):
  dma-buf: system_heap: Rework system heap to use sgtables instead of
    pagelists
  dma-buf: heaps: Move heap-helper logic into the cma_heap
    implementation
  dma-buf: heaps: Remove heap-helpers code
  dma-buf: heaps: Skip sync if not mapped
  dma-buf: system_heap: Allocate higher order pages if available
  dma-buf: dma-heap: Keep track of the heap device struct
  dma-buf: system_heap: Add a system-uncached heap re-using the system
    heap

 drivers/dma-buf/dma-heap.c           |  33 +-
 drivers/dma-buf/heaps/Makefile       |   1 -
 drivers/dma-buf/heaps/cma_heap.c     | 327 +++++++++++++++---
 drivers/dma-buf/heaps/heap-helpers.c | 270 ---------------
 drivers/dma-buf/heaps/heap-helpers.h |  53 ---
 drivers/dma-buf/heaps/system_heap.c  | 488 ++++++++++++++++++++++++---
 include/linux/dma-heap.h             |   9 +
 7 files changed, 749 insertions(+), 432 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

-- 
2.17.1



end of thread, other threads:[~2020-10-30 19:21 UTC | newest]

Thread overview: 17+ messages
2020-10-29  0:16 [RESEND][PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation John Stultz
2020-10-29  0:16 ` [PATCH v4 1/7] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists John Stultz
2020-10-29  0:16 ` [PATCH v4 2/7] dma-buf: heaps: Move heap-helper logic into the cma_heap implementation John Stultz
2020-10-29  0:16 ` [PATCH v4 3/7] dma-buf: heaps: Remove heap-helpers code John Stultz
2020-10-29  0:16 ` [PATCH v4 4/7] dma-buf: heaps: Skip sync if not mapped John Stultz
2020-10-29  0:16 ` [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available John Stultz
2020-10-29  7:02   ` Hillf Danton
2020-10-29 19:34     ` John Stultz
2020-10-29  0:16 ` [PATCH v4 6/7] dma-buf: dma-heap: Keep track of the heap device struct John Stultz
2020-10-29  0:16 ` [PATCH v4 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap John Stultz
2020-10-29  7:10   ` Hillf Danton
2020-10-29 22:28     ` John Stultz
  -- strict thread matches above, loose matches on Subject: below --
2020-10-30  7:50 Hillf Danton
2020-10-30 19:21 ` John Stultz
2020-10-30  2:47 Hillf Danton
2020-10-30  4:04 ` John Stultz
2020-10-17  1:32 [PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation John Stultz
2020-10-17  1:32 ` [PATCH v4 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap John Stultz
