From: John Stultz <john.stultz@linaro.org>
To: lkml <linux-kernel@vger.kernel.org>
Cc: "John Stultz" <john.stultz@linaro.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Liam Mark" <lmark@codeaurora.org>,
	"Laura Abbott" <labbott@kernel.org>,
	"Brian Starkey" <Brian.Starkey@arm.com>,
	"Hridya Valsaraju" <hridya@google.com>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Sandeep Patil" <sspatil@google.com>,
	"Daniel Mentz" <danielmentz@google.com>,
	"Chris Goldsworthy" <cgoldswo@codeaurora.org>,
	"Ørjan Eide" <orjan.eide@arm.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Ezequiel Garcia" <ezequiel@collabora.com>,
	"Simon Ser" <contact@emersion.fr>,
	"James Jones" <jajones@nvidia.com>,
	linux-media@vger.kernel.org,
	dri-devel@lists.freedesktop.org
Subject: [PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation
Date: Sat, 17 Oct 2020 01:32:48 +0000
Message-ID: <20201017013255.43568-1-john.stultz@linaro.org>

Hey All,
  So this is another revision of my patch series of performance
optimizations for the dma-buf system heap.

This series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After that, the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap, but unfortunately that code is tied somewhat
to how the buffer's memory is tracked.

After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.

Next, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION allocation code (though there is still a gap due
to ION using a mix of deferred-freeing and page pools; I'll be
looking at integrating those eventually).
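
As a rough illustration of the larger-order idea (a userspace model, not
code from the series): walk a short list of page orders from largest to
smallest and take the biggest chunk that still fits in what remains. The
order list {8, 4, 0}, the 4 KiB page size, and the names pick_order() and
count_chunks() are assumptions made for this sketch. A real allocator would
also fall back to a smaller order when a large allocation fails, which this
model does not simulate.

```c
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096UL

/* Assumed order list: 1 MiB, 64 KiB, then single 4 KiB pages. */
static const unsigned int orders[] = { 8, 4, 0 };
#define NUM_ORDERS (sizeof(orders) / sizeof(orders[0]))

/*
 * Return the largest order that is no bigger than max_order and whose
 * chunk size fits in the remaining byte count, or -1 if none fits.
 */
static int pick_order(unsigned long remaining, unsigned int max_order)
{
	for (size_t i = 0; i < NUM_ORDERS; i++) {
		if (orders[i] > max_order)
			continue;
		if (remaining < (PAGE_SIZE_BYTES << orders[i]))
			continue;
		return (int)orders[i];
	}
	return -1;
}

/*
 * Count how many chunks are needed to cover len bytes.
 * len is assumed to be a multiple of PAGE_SIZE_BYTES.
 */
static unsigned int count_chunks(unsigned long len)
{
	unsigned int chunks = 0;
	unsigned int max_order = orders[0];

	while (len > 0) {
		int order = pick_order(len, max_order);

		if (order < 0)
			break;
		len -= PAGE_SIZE_BYTES << order;
		/* Never go back up to a larger order once we have stepped down. */
		max_order = (unsigned int)order;
		chunks++;
	}
	return chunks;
}
```

With these assumptions a 1 MiB buffer is covered by a single order-8 chunk
instead of 256 single pages, which means far fewer scatterlist entries to
build and walk.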

Finally, a reworked version of my uncached system heap
implementation I was submitting a few weeks back. Since it
duplicated a lot of the now reworked system heap code, I
realized it would be much simpler to add the functionality to
the system_heap implementation itself.

While not improving the core allocation performance, the
uncached heap allocations do result in *much* improved
performance on HiKey960 as it avoids a lot of flushing and
invalidating buffers that the cpu doesn't touch often.

Feedback on these would be great!

thanks
-john

New in v4:
* Make sys_heap static (indirectly)
    Reported-by: kernel test robot <lkp@intel.com>
* Spelling fixes suggested by BrianS
* Make sys_uncached_heap static, as
    Reported-by: kernel test robot <lkp@intel.com>
* Fix wrong return value, caught by smatch
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
* Ensure we call flush/invalidate_kernel_vmap_range() in the
  uncached cases to try to address feedback about VIVT caches
  from Christoph
* Reorder a few lines as suggested by BrianS
* Avoid holding the initial mapping for the lifetime of the buffer
  as suggested by BrianS
* Fix an unlikely race between allocate and updating the dma_mask
  that BrianS noticed.

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (7):
  dma-buf: system_heap: Rework system heap to use sgtables instead of
    pagelists
  dma-buf: heaps: Move heap-helper logic into the cma_heap
    implementation
  dma-buf: heaps: Remove heap-helpers code
  dma-buf: heaps: Skip sync if not mapped
  dma-buf: system_heap: Allocate higher order pages if available
  dma-buf: dma-heap: Keep track of the heap device struct
  dma-buf: system_heap: Add a system-uncached heap re-using the system
    heap

 drivers/dma-buf/dma-heap.c           |  33 +-
 drivers/dma-buf/heaps/Makefile       |   1 -
 drivers/dma-buf/heaps/cma_heap.c     | 327 +++++++++++++++---
 drivers/dma-buf/heaps/heap-helpers.c | 270 ---------------
 drivers/dma-buf/heaps/heap-helpers.h |  53 ---
 drivers/dma-buf/heaps/system_heap.c  | 488 ++++++++++++++++++++++++---
 include/linux/dma-heap.h             |   9 +
 7 files changed, 749 insertions(+), 432 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

-- 
2.17.1
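
For readers skimming the shortlog, the "Skip sync if not mapped" change can
be pictured with a small userspace model. This is only a sketch under stated
assumptions, not the code from patch 4/7: struct attachment, its mapped and
sync_count fields, and begin_cpu_access() are hypothetical stand-ins, and
the counter takes the place of a real cache-maintenance call. The idea is
that each attachment records whether it currently holds a mapping, and the
CPU-access path skips the ones that do not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device attachment record kept on a buffer's list. */
struct attachment {
	bool mapped;             /* set on map, cleared on unmap */
	unsigned int sync_count; /* stand-in for a real cache sync call */
	struct attachment *next;
};

/*
 * Model of a begin-CPU-access pass: only attachments that currently
 * hold a mapping are synced. Returns how many syncs were issued.
 */
static unsigned int begin_cpu_access(struct attachment *list)
{
	unsigned int synced = 0;

	for (struct attachment *a = list; a != NULL; a = a->next) {
		if (!a->mapped)
			continue; /* nothing mapped, so nothing to sync */
		a->sync_count++;
		synced++;
	}
	return synced;
}
```

In this model a buffer with many attached but idle devices pays only for the
attachments that are actually mapped at the time of the CPU access.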