* Re: [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available
@ 2020-10-30 2:34 Hillf Danton
2020-10-30 3:30 ` John Stultz
0 siblings, 1 reply; 6+ messages in thread
From: Hillf Danton @ 2020-10-30 2:34 UTC (permalink / raw)
To: John Stultz
Cc: James Jones, Robin Murphy, Liam Mark, lkml, dri-devel,
Ezequiel Garcia, linux-media
On Thu, 29 Oct 2020 12:34:51 -0700 John Stultz wrote:
>
> As for your comment on HPAGE_PMD_ORDER (9 on arm64/arm) and
> PAGE_ALLOC_COSTLY_ORDER(3), I'm not totally sure I understand your
> question? Are you suggesting those values would be more natural orders
> to choose from?
The numbers 9 and 3 are not magic in themselves, but under the mm directory
they draw more attention than others do. Sometimes it takes me two
minutes to work out that HPAGE_PMD_ORDER does not mean 1MiB, whether on
platforms like arm64 or not.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available
2020-10-30 2:34 [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available Hillf Danton
@ 2020-10-30 3:30 ` John Stultz
0 siblings, 0 replies; 6+ messages in thread
From: John Stultz @ 2020-10-30 3:30 UTC (permalink / raw)
To: Hillf Danton
Cc: James Jones, Robin Murphy, Liam Mark, lkml, dri-devel,
Ezequiel Garcia, linux-media
On Thu, Oct 29, 2020 at 7:34 PM Hillf Danton <hdanton@sina.com> wrote:
> On Thu, 29 Oct 2020 12:34:51 -0700 John Stultz wrote:
> > As for your comment on HPAGE_PMD_ORDER (9 on arm64/arm) and
> > PAGE_ALLOC_COSTLY_ORDER(3), I'm not totally sure I understand your
> > question? Are you suggesting those values would be more natural orders
> > to choose from?
>
> The numbers 9 and 3 are not magic in themselves, but under the mm directory
> they draw more attention than others do. Sometimes it takes me two
> minutes to work out that HPAGE_PMD_ORDER does not mean 1MiB, whether on
> platforms like arm64 or not.
Yes, I can say it took me longer than two minutes to dig around and
work out HPAGE_PMD_ORDER for my last reply. :)
Though I'm still a bit unsure if you are proposing something more than
just a comment to explain why order 8 and order 4 allocations are used
in my patch? Please let me know if so.
thanks
-john
* [RESEND][PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation
@ 2020-10-29 0:16 John Stultz
2020-10-29 0:16 ` [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available John Stultz
0 siblings, 1 reply; 6+ messages in thread
From: John Stultz @ 2020-10-29 0:16 UTC (permalink / raw)
To: lkml
Cc: Sandeep Patil, dri-devel, Ezequiel Garcia, Robin Murphy,
James Jones, Liam Mark, Laura Abbott, Chris Goldsworthy,
Hridya Valsaraju, Ørjan Eide, linux-media,
Suren Baghdasaryan, Daniel Mentz
Hey All,
So just wanted to resend my last revision of my patch series
of performance optimizations to the dma-buf system heap.
This series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After which the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked. As more heaps show up I
think we'll have a better idea how to best share code, so for
now I think this is ok.
After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.
Next, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION allocation code (though there still is a gap due
to ION using a mix of deferred-freeing and page pools, I'll be
looking at integrating those eventually).
Finally, a reworked version of my uncached system heap
implementation I was submitting a few weeks back. Since it
duplicated a lot of the now reworked system heap code, I
realized it would be much simpler to add the functionality to
the system_heap implementation itself.
While not improving the core allocation performance, the
uncached heap allocations do result in *much* improved
performance on HiKey960 as it avoids a lot of flushing and
invalidating buffers that the cpu doesn't touch often.
Feedback on these would be great!
thanks
-john
New in v4:
* Make sys_heap static (indirectly) Reported-by:
kernel test robot <lkp@intel.com>
* Spelling fixes suggested by BrianS
* Make sys_uncached_heap static, as
Reported-by: kernel test robot <lkp@intel.com>
* Fix wrong return value, caught by smatch
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
* Ensure we call flush/invalidate_kernel_vmap_range() in the
uncached cases to try to address feedback about VIVT caches
from Christoph
* Reorder a few lines as suggested by BrianS
* Avoid holding the initial mapping for the lifetime of the buffer
as suggested by BrianS
* Fix an unlikely race between allocate and updating the dma_mask
that BrianS noticed.
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
John Stultz (7):
dma-buf: system_heap: Rework system heap to use sgtables instead of
pagelists
dma-buf: heaps: Move heap-helper logic into the cma_heap
implementation
dma-buf: heaps: Remove heap-helpers code
dma-buf: heaps: Skip sync if not mapped
dma-buf: system_heap: Allocate higher order pages if available
dma-buf: dma-heap: Keep track of the heap device struct
dma-buf: system_heap: Add a system-uncached heap re-using the system
heap
drivers/dma-buf/dma-heap.c | 33 +-
drivers/dma-buf/heaps/Makefile | 1 -
drivers/dma-buf/heaps/cma_heap.c | 324 +++++++++++++++---
drivers/dma-buf/heaps/heap-helpers.c | 270 ---------------
drivers/dma-buf/heaps/heap-helpers.h | 53 ---
drivers/dma-buf/heaps/system_heap.c | 488 ++++++++++++++++++++++++---
include/linux/dma-heap.h | 9 +
7 files changed, 747 insertions(+), 431 deletions(-)
delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h
--
2.17.1
* [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available
2020-10-29 0:16 [RESEND][PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation John Stultz
@ 2020-10-29 0:16 ` John Stultz
2020-10-29 7:02 ` Hillf Danton
0 siblings, 1 reply; 6+ messages in thread
From: John Stultz @ 2020-10-29 0:16 UTC (permalink / raw)
To: lkml
Cc: Sandeep Patil, dri-devel, Ezequiel Garcia, Robin Murphy,
James Jones, Liam Mark, Laura Abbott, Chris Goldsworthy,
Hridya Valsaraju, Ørjan Eide, linux-media,
Suren Baghdasaryan, Daniel Mentz
While the system heap can return non-contiguous pages,
try to allocate larger order pages if possible.
This will allow slight performance gains and make implementing
page pooling easier.
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Brian Starkey <brian.starkey@arm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
v3:
* Use page_size() rather than open-coding it
---
drivers/dma-buf/heaps/system_heap.c | 83 ++++++++++++++++++++++-------
1 file changed, 65 insertions(+), 18 deletions(-)
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 15b36bc862b1..ef4b2c1032df 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -40,6 +40,14 @@ struct dma_heap_attachment {
bool mapped;
};
+#define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
+ | __GFP_NORETRY) & ~__GFP_RECLAIM) \
+ | __GFP_COMP)
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
+static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
+static const unsigned int orders[] = {8, 4, 0};
+#define NUM_ORDERS ARRAY_SIZE(orders)
+
static struct sg_table *dup_sg_table(struct sg_table *table)
{
struct sg_table *new_table;
@@ -270,8 +278,11 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
int i;
table = &buffer->sg_table;
- for_each_sgtable_sg(table, sg, i)
- __free_page(sg_page(sg));
+ for_each_sg(table->sgl, sg, table->nents, i) {
+ struct page *page = sg_page(sg);
+
+ __free_pages(page, compound_order(page));
+ }
sg_free_table(table);
kfree(buffer);
}
@@ -289,6 +300,26 @@ static const struct dma_buf_ops system_heap_buf_ops = {
.release = system_heap_dma_buf_release,
};
+static struct page *alloc_largest_available(unsigned long size,
+ unsigned int max_order)
+{
+ struct page *page;
+ int i;
+
+ for (i = 0; i < NUM_ORDERS; i++) {
+ if (size < (PAGE_SIZE << orders[i]))
+ continue;
+ if (max_order < orders[i])
+ continue;
+
+ page = alloc_pages(order_flags[i], orders[i]);
+ if (!page)
+ continue;
+ return page;
+ }
+ return NULL;
+}
+
static int system_heap_allocate(struct dma_heap *heap,
unsigned long len,
unsigned long fd_flags,
@@ -296,11 +327,13 @@ static int system_heap_allocate(struct dma_heap *heap,
{
struct system_heap_buffer *buffer;
DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
+ unsigned long size_remaining = len;
+ unsigned int max_order = orders[0];
struct dma_buf *dmabuf;
struct sg_table *table;
struct scatterlist *sg;
- pgoff_t pagecount;
- pgoff_t pg;
+ struct list_head pages;
+ struct page *page, *tmp_page;
int i, ret = -ENOMEM;
buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
@@ -312,25 +345,35 @@ static int system_heap_allocate(struct dma_heap *heap,
buffer->heap = heap;
buffer->len = len;
- table = &buffer->sg_table;
- pagecount = len / PAGE_SIZE;
- if (sg_alloc_table(table, pagecount, GFP_KERNEL))
- goto free_buffer;
-
- sg = table->sgl;
- for (pg = 0; pg < pagecount; pg++) {
- struct page *page;
+ INIT_LIST_HEAD(&pages);
+ i = 0;
+ while (size_remaining > 0) {
/*
* Avoid trying to allocate memory if the process
* has been killed by SIGKILL
*/
if (fatal_signal_pending(current))
- goto free_pages;
- page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+ goto free_buffer;
+
+ page = alloc_largest_available(size_remaining, max_order);
if (!page)
- goto free_pages;
+ goto free_buffer;
+
+ list_add_tail(&page->lru, &pages);
+ size_remaining -= page_size(page);
+ max_order = compound_order(page);
+ i++;
+ }
+
+ table = &buffer->sg_table;
+ if (sg_alloc_table(table, i, GFP_KERNEL))
+ goto free_buffer;
+
+ sg = table->sgl;
+ list_for_each_entry_safe(page, tmp_page, &pages, lru) {
sg_set_page(sg, page, page_size(page), 0);
sg = sg_next(sg);
+ list_del(&page->lru);
}
/* create the dmabuf */
@@ -350,14 +393,18 @@ static int system_heap_allocate(struct dma_heap *heap,
/* just return, as put will call release and that will free */
return ret;
}
-
return ret;
free_pages:
- for_each_sgtable_sg(table, sg, i)
- __free_page(sg_page(sg));
+ for_each_sgtable_sg(table, sg, i) {
+ struct page *p = sg_page(sg);
+
+ __free_pages(p, compound_order(p));
+ }
sg_free_table(table);
free_buffer:
+ list_for_each_entry_safe(page, tmp_page, &pages, lru)
+ __free_pages(page, compound_order(page));
kfree(buffer);
return ret;
--
2.17.1
* Re: [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available
2020-10-29 0:16 ` [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available John Stultz
@ 2020-10-29 7:02 ` Hillf Danton
2020-10-29 19:34 ` John Stultz
0 siblings, 1 reply; 6+ messages in thread
From: Hillf Danton @ 2020-10-29 7:02 UTC (permalink / raw)
To: John Stultz
Cc: James Jones, Robin Murphy, Liam Mark, lkml, dri-devel,
Ezequiel Garcia, linux-media
On Thu, 29 Oct 2020 00:16:22 +0000 John Stultz wrote:
>
> +#define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
> + | __GFP_NORETRY) & ~__GFP_RECLAIM) \
> + | __GFP_COMP)
> +#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
> +static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
> +static const unsigned int orders[] = {8, 4, 0};
> +#define NUM_ORDERS ARRAY_SIZE(orders)
A two-line comment would help a lot in understanding the orders above if it
explained the reasons for deviating from HPAGE_PMD_ORDER and PAGE_ALLOC_COSTLY_ORDER.
* Re: [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available
2020-10-29 7:02 ` Hillf Danton
@ 2020-10-29 19:34 ` John Stultz
0 siblings, 0 replies; 6+ messages in thread
From: John Stultz @ 2020-10-29 19:34 UTC (permalink / raw)
To: Hillf Danton
Cc: James Jones, Robin Murphy, Liam Mark, lkml, dri-devel,
Ezequiel Garcia, linux-media
On Thu, Oct 29, 2020 at 12:02 AM Hillf Danton <hdanton@sina.com> wrote:
>
> On Thu, 29 Oct 2020 00:16:22 +0000 John Stultz wrote:
> >
> > +#define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
> > + | __GFP_NORETRY) & ~__GFP_RECLAIM) \
> > + | __GFP_COMP)
> > +#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
> > +static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
> > +static const unsigned int orders[] = {8, 4, 0};
> > +#define NUM_ORDERS ARRAY_SIZE(orders)
>
> A two-line comment would help a lot in understanding the orders above if it
> explained the reasons for deviating from HPAGE_PMD_ORDER and PAGE_ALLOC_COSTLY_ORDER.
Thanks so much for the review and feedback!
So yes, this was pulled from ION's system heap:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ion/ion_system_heap.c#n20
But adding __GFP_COMP as that's added by ION in the pagepool code I
didn't include:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ion/ion_page_pool.c#n146
I unfortunately don't have a lot of detail on the exact rationale
(other than what I can pull from the commit log), but I suspect it has
to do with experiential knowledge that the majority of graphics buffers
are small multiples of 1M or 64K.
But I do agree some rationale in a comment would be helpful, and will
try to add that.
As for your comment on HPAGE_PMD_ORDER (9 on arm64/arm) and
PAGE_ALLOC_COSTLY_ORDER(3), I'm not totally sure I understand your
question? Are you suggesting those values would be more natural orders
to choose from?
Thanks again!
-john
* [PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation
@ 2020-10-17 1:32 John Stultz
2020-10-17 1:32 ` [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available John Stultz
0 siblings, 1 reply; 6+ messages in thread
From: John Stultz @ 2020-10-17 1:32 UTC (permalink / raw)
To: lkml
Cc: Sandeep Patil, dri-devel, Ezequiel Garcia, Robin Murphy,
James Jones, Liam Mark, Laura Abbott, Chris Goldsworthy,
Hridya Valsaraju, Ørjan Eide, linux-media,
Suren Baghdasaryan, Daniel Mentz
Hey All,
So this is another revision of my patch series of performance
optimizations to the dma-buf system heap.
This series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After which the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked.
After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.
Next, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION allocation code (though there still is a gap due
to ION using a mix of deferred-freeing and page pools, I'll be
looking at integrating those eventually).
Finally, a reworked version of my uncached system heap
implementation I was submitting a few weeks back. Since it
duplicated a lot of the now reworked system heap code, I
realized it would be much simpler to add the functionality to
the system_heap implementation itself.
While not improving the core allocation performance, the
uncached heap allocations do result in *much* improved
performance on HiKey960 as it avoids a lot of flushing and
invalidating buffers that the cpu doesn't touch often.
Feedback on these would be great!
thanks
-john
New in v4:
* Make sys_heap static (indirectly) Reported-by:
kernel test robot <lkp@intel.com>
* Spelling fixes suggested by BrianS
* Make sys_uncached_heap static, as
Reported-by: kernel test robot <lkp@intel.com>
* Fix wrong return value, caught by smatch
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
* Ensure we call flush/invalidate_kernel_vmap_range() in the
uncached cases to try to address feedback about VIVT caches
from Christoph
* Reorder a few lines as suggested by BrianS
* Avoid holding the initial mapping for the lifetime of the buffer
as suggested by BrianS
* Fix an unlikely race between allocate and updating the dma_mask
that BrianS noticed.
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
John Stultz (7):
dma-buf: system_heap: Rework system heap to use sgtables instead of
pagelists
dma-buf: heaps: Move heap-helper logic into the cma_heap
implementation
dma-buf: heaps: Remove heap-helpers code
dma-buf: heaps: Skip sync if not mapped
dma-buf: system_heap: Allocate higher order pages if available
dma-buf: dma-heap: Keep track of the heap device struct
dma-buf: system_heap: Add a system-uncached heap re-using the system
heap
drivers/dma-buf/dma-heap.c | 33 +-
drivers/dma-buf/heaps/Makefile | 1 -
drivers/dma-buf/heaps/cma_heap.c | 327 +++++++++++++++---
drivers/dma-buf/heaps/heap-helpers.c | 270 ---------------
drivers/dma-buf/heaps/heap-helpers.h | 53 ---
drivers/dma-buf/heaps/system_heap.c | 488 ++++++++++++++++++++++++---
include/linux/dma-heap.h | 9 +
7 files changed, 749 insertions(+), 432 deletions(-)
delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h
--
2.17.1
* [PATCH v4 5/7] dma-buf: system_heap: Allocate higher order pages if available
2020-10-17 1:32 [PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation John Stultz
@ 2020-10-17 1:32 ` John Stultz
0 siblings, 0 replies; 6+ messages in thread
From: John Stultz @ 2020-10-17 1:32 UTC (permalink / raw)
To: lkml
Cc: Sandeep Patil, dri-devel, Ezequiel Garcia, Robin Murphy,
James Jones, Liam Mark, Laura Abbott, Chris Goldsworthy,
Hridya Valsaraju, Ørjan Eide, linux-media,
Suren Baghdasaryan, Daniel Mentz
While the system heap can return non-contiguous pages,
try to allocate larger order pages if possible.
This will allow slight performance gains and make implementing
page pooling easier.
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Brian Starkey <brian.starkey@arm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
v3:
* Use page_size() rather than open-coding it
---
drivers/dma-buf/heaps/system_heap.c | 83 ++++++++++++++++++++++-------
1 file changed, 65 insertions(+), 18 deletions(-)
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 15b36bc862b1..ef4b2c1032df 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -40,6 +40,14 @@ struct dma_heap_attachment {
bool mapped;
};
+#define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
+ | __GFP_NORETRY) & ~__GFP_RECLAIM) \
+ | __GFP_COMP)
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
+static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
+static const unsigned int orders[] = {8, 4, 0};
+#define NUM_ORDERS ARRAY_SIZE(orders)
+
static struct sg_table *dup_sg_table(struct sg_table *table)
{
struct sg_table *new_table;
@@ -270,8 +278,11 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
int i;
table = &buffer->sg_table;
- for_each_sgtable_sg(table, sg, i)
- __free_page(sg_page(sg));
+ for_each_sg(table->sgl, sg, table->nents, i) {
+ struct page *page = sg_page(sg);
+
+ __free_pages(page, compound_order(page));
+ }
sg_free_table(table);
kfree(buffer);
}
@@ -289,6 +300,26 @@ static const struct dma_buf_ops system_heap_buf_ops = {
.release = system_heap_dma_buf_release,
};
+static struct page *alloc_largest_available(unsigned long size,
+ unsigned int max_order)
+{
+ struct page *page;
+ int i;
+
+ for (i = 0; i < NUM_ORDERS; i++) {
+ if (size < (PAGE_SIZE << orders[i]))
+ continue;
+ if (max_order < orders[i])
+ continue;
+
+ page = alloc_pages(order_flags[i], orders[i]);
+ if (!page)
+ continue;
+ return page;
+ }
+ return NULL;
+}
+
static int system_heap_allocate(struct dma_heap *heap,
unsigned long len,
unsigned long fd_flags,
@@ -296,11 +327,13 @@ static int system_heap_allocate(struct dma_heap *heap,
{
struct system_heap_buffer *buffer;
DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
+ unsigned long size_remaining = len;
+ unsigned int max_order = orders[0];
struct dma_buf *dmabuf;
struct sg_table *table;
struct scatterlist *sg;
- pgoff_t pagecount;
- pgoff_t pg;
+ struct list_head pages;
+ struct page *page, *tmp_page;
int i, ret = -ENOMEM;
buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
@@ -312,25 +345,35 @@ static int system_heap_allocate(struct dma_heap *heap,
buffer->heap = heap;
buffer->len = len;
- table = &buffer->sg_table;
- pagecount = len / PAGE_SIZE;
- if (sg_alloc_table(table, pagecount, GFP_KERNEL))
- goto free_buffer;
-
- sg = table->sgl;
- for (pg = 0; pg < pagecount; pg++) {
- struct page *page;
+ INIT_LIST_HEAD(&pages);
+ i = 0;
+ while (size_remaining > 0) {
/*
* Avoid trying to allocate memory if the process
* has been killed by SIGKILL
*/
if (fatal_signal_pending(current))
- goto free_pages;
- page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+ goto free_buffer;
+
+ page = alloc_largest_available(size_remaining, max_order);
if (!page)
- goto free_pages;
+ goto free_buffer;
+
+ list_add_tail(&page->lru, &pages);
+ size_remaining -= page_size(page);
+ max_order = compound_order(page);
+ i++;
+ }
+
+ table = &buffer->sg_table;
+ if (sg_alloc_table(table, i, GFP_KERNEL))
+ goto free_buffer;
+
+ sg = table->sgl;
+ list_for_each_entry_safe(page, tmp_page, &pages, lru) {
sg_set_page(sg, page, page_size(page), 0);
sg = sg_next(sg);
+ list_del(&page->lru);
}
/* create the dmabuf */
@@ -350,14 +393,18 @@ static int system_heap_allocate(struct dma_heap *heap,
/* just return, as put will call release and that will free */
return ret;
}
-
return ret;
free_pages:
- for_each_sgtable_sg(table, sg, i)
- __free_page(sg_page(sg));
+ for_each_sgtable_sg(table, sg, i) {
+ struct page *p = sg_page(sg);
+
+ __free_pages(p, compound_order(p));
+ }
sg_free_table(table);
free_buffer:
+ list_for_each_entry_safe(page, tmp_page, &pages, lru)
+ __free_pages(page, compound_order(page));
kfree(buffer);
return ret;
--
2.17.1