All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] Avoid pessimistic scatter-gather allocation
@ 2016-10-21 14:11 ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2016-10-21 14:11 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, linux-media, Chris Wilson, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can decrease the i915 kernel memory usage by doing more sg list
coallescing and avoiding the pessimistic list allocation.

At the moment we got two places in our code, the main shmemfs backed
object allocator, and the userptr object allocator, which both can
allocate sg list size pessimistically, and in the latter case also do
not exploit entry coallescing when it is possible.

This results in between one to six megabytes of memory wasted on unused
sg list entries under some common workloads:

    * Logging into KDE there is 1-2 MiB of unused sg entries.
    * Running the T-Rex benchamrk aroun 3 Mib.
    * Similarly for Manhattan 5-6 MiB.

To remove this wastage this series starts with some cleanups in the
sg_alloc_table_from_pages implementation and then adds and exports a new
__sg_alloc_table_from_pages function.

This then gets used by the i915 driver to achieve the described savings.

Tvrtko Ursulin (5):
  lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
  lib/scatterlist: Avoid potential scatterlist entry overflow
  lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  drm/i915: Use __sg_alloc_table_from_pages for allocating object
    backing store
  drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

 drivers/gpu/drm/i915/i915_drv.h                |  9 +++
 drivers/gpu/drm/i915/i915_gem.c                | 77 +++++++++++--------------
 drivers/gpu/drm/i915/i915_gem_userptr.c        | 29 +++-------
 drivers/media/v4l2-core/videobuf2-dma-contig.c |  4 +-
 drivers/rapidio/devices/rio_mport_cdev.c       |  4 +-
 include/linux/scatterlist.h                    | 11 ++--
 lib/scatterlist.c                              | 78 ++++++++++++++++++++------
 7 files changed, 120 insertions(+), 92 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread
* [PATCH 1/5] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
@ 2017-07-31 18:55 Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2017-07-31 18:55 UTC (permalink / raw)
  To: Intel-gfx
  Cc: tursulin, Tvrtko Ursulin, Masahiro Yamada, Pawel Osciak,
	Marek Szyprowski, Kyungmin Park, Tomasz Stanislawski,
	Matt Porter, Alexandre Bounine, linux-media, linux-kernel

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.

Since these are offsets withing a page, unsigned int is
wide enough.

Also converts callers which were using unsigned long locally
with the lower_32_bits annotation to make it explicitly
clear what is happening.

v2: Use offset_in_page. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
 drivers/rapidio/devices/rio_mport_cdev.c       | 4 ++--
 include/linux/scatterlist.h                    | 2 +-
 lib/scatterlist.c                              | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 4f246d166111..2405077fdc71 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -479,7 +479,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 {
 	struct vb2_dc_buf *buf;
 	struct frame_vector *vec;
-	unsigned long offset;
+	unsigned int offset;
 	int n_pages, i;
 	int ret = 0;
 	struct sg_table *sgt;
@@ -507,7 +507,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 	buf->dev = dev;
 	buf->dma_dir = dma_dir;
 
-	offset = vaddr & ~PAGE_MASK;
+	offset = lower_32_bits(offset_in_page(vaddr));
 	vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
 	if (IS_ERR(vec)) {
 		ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 5beb0c361076..5c1b6388122a 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
 	 * offset within the internal buffer specified by handle parameter.
 	 */
 	if (xfer->loc_addr) {
-		unsigned long offset;
+		unsigned int offset;
 		long pinned;
 
-		offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+		offset = lower_32_bits(offset_in_page(xfer->loc_addr));
 		nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
 
 		page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 4b3286ac60c8..205aefb4ed93 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 int sg_alloc_table_from_pages(struct sg_table *sgt,
 	struct page **pages, unsigned int n_pages,
-	unsigned long offset, unsigned long size,
+	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index be7b4dd6b68d..dee0c5004e2f 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
  */
 int sg_alloc_table_from_pages(struct sg_table *sgt,
 	struct page **pages, unsigned int n_pages,
-	unsigned long offset, unsigned long size,
+	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
 	unsigned int chunks;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-07-31 18:55 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-21 14:11 [PATCH 0/5] Avoid pessimistic scatter-gather allocation Tvrtko Ursulin
2016-10-21 14:11 ` Tvrtko Ursulin
2016-10-21 14:11 ` [PATCH 1/5] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2016-10-21 14:11   ` Tvrtko Ursulin
2016-10-24  7:21   ` Marek Szyprowski
2016-10-21 14:11 ` [PATCH 2/5] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
2016-10-21 14:11 ` [PATCH 3/5] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
2016-10-21 14:11 ` [PATCH 4/5] drm/i915: Use __sg_alloc_table_from_pages for allocating object backing store Tvrtko Ursulin
2016-10-21 14:11   ` Tvrtko Ursulin
2016-10-21 14:27   ` Chris Wilson
2016-10-21 14:55     ` [Intel-gfx] " Tvrtko Ursulin
2016-10-21 14:55       ` Tvrtko Ursulin
2016-10-21 14:11 ` [PATCH 5/5] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
2016-10-21 14:11   ` Tvrtko Ursulin
2016-10-21 15:53 ` ✗ Fi.CI.BAT: failure for Avoid pessimistic scatter-gather allocation Patchwork
2017-07-31 18:55 [PATCH 1/5] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.