linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Compact userptr object backing store allocation
@ 2016-11-11  8:50 Tvrtko Ursulin
  2016-11-11  8:50 ` [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11  8:50 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Userptr backing store with SWIOTBL active is currently allocated in the same
inefficient manner, with one sg entry per object page, as what the commit
871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment size") fixed
for regular GEM objects.

We can fix that by adding new a __sg_alloc_table_from_pages core function which
allows us to control the maximum desired coalesced segment size.

Other than that the series starts with two simple fixes to
sg_alloc_table_from_pages which deal with incorrect data type usage and a
theoretical overflow condition. Fixing the latter enables easy addition of the
above mentioned __sg_alloc_table_from_pages.

Tvrtko Ursulin (4):
  lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
  lib/scatterlist: Avoid potential scatterlist entry overflow
  lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

 drivers/gpu/drm/i915/i915_drv.h                |  9 +++
 drivers/gpu/drm/i915/i915_gem.c                | 15 +----
 drivers/gpu/drm/i915/i915_gem_userptr.c        | 28 ++-------
 drivers/media/v4l2-core/videobuf2-dma-contig.c |  4 +-
 drivers/rapidio/devices/rio_mport_cdev.c       |  4 +-
 include/linux/scatterlist.h                    | 11 ++--
 lib/scatterlist.c                              | 78 ++++++++++++++++++++------
 7 files changed, 87 insertions(+), 62 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
  2016-11-11  8:50 [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
@ 2016-11-11  8:50 ` Tvrtko Ursulin
  2016-11-14  9:55   ` [Intel-gfx] " Chris Wilson
  2016-11-11  8:50 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11  8:50 UTC (permalink / raw)
  To: Intel-gfx
  Cc: linux-kernel, Tvrtko Ursulin, Masahiro Yamada, Pawel Osciak,
	Marek Szyprowski, Kyungmin Park, Tomasz Stanislawski,
	Matt Porter, Alexandre Bounine, linux-media

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.

Since these are offsets withing a page, unsigned int is
wide enough.

Also converts callers which were using unsigned long locally
with the lower_32_bits annotation to make it explicitly
clear what is happening.

v2: Use offset_in_page. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
 drivers/rapidio/devices/rio_mport_cdev.c       | 4 ++--
 include/linux/scatterlist.h                    | 2 +-
 lib/scatterlist.c                              | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index fb6a177be461..51e8765bc3c6 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 {
 	struct vb2_dc_buf *buf;
 	struct frame_vector *vec;
-	unsigned long offset;
+	unsigned int offset;
 	int n_pages, i;
 	int ret = 0;
 	struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 	buf->dev = dev;
 	buf->dma_dir = dma_dir;
 
-	offset = vaddr & ~PAGE_MASK;
+	offset = lower_32_bits(offset_in_page(vaddr));
 	vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
 	if (IS_ERR(vec)) {
 		ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 9013a585507e..0fae29ff47ba 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
 	 * offset within the internal buffer specified by handle parameter.
 	 */
 	if (xfer->loc_addr) {
-		unsigned long offset;
+		unsigned int offset;
 		long pinned;
 
-		offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+		offset = lower_32_bits(offset_in_page(xfer->loc_addr));
 		nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
 
 		page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 int sg_alloc_table_from_pages(struct sg_table *sgt,
 	struct page **pages, unsigned int n_pages,
-	unsigned long offset, unsigned long size,
+	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 004fc70fc56a..e05e7fc98892 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
  */
 int sg_alloc_table_from_pages(struct sg_table *sgt,
 	struct page **pages, unsigned int n_pages,
-	unsigned long offset, unsigned long size,
+	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
 	unsigned int chunks;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2016-11-11  8:50 [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
  2016-11-11  8:50 ` [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
@ 2016-11-11  8:50 ` Tvrtko Ursulin
  2016-11-11 10:19   ` [Intel-gfx] " Chris Wilson
  2016-11-11  8:50 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11  8:50 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, Tvrtko Ursulin, Masahiro Yamada

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coallescing pages to a single entry.

v2: Drop reference to future use. Use UINT_MAX.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
---
 lib/scatterlist.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..de15f369b317 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
-	unsigned int chunks;
+	const unsigned int max_segment = UINT_MAX;
+	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
@@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	/* compute number of contiguous chunks */
 	chunks = 1;
-	for (i = 1; i < n_pages; ++i)
-		if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+	seg_len = PAGE_SIZE;
+	for (i = 1; i < n_pages; ++i) {
+		if (seg_len >= max_segment ||
+		    page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
 			++chunks;
+			seg_len = PAGE_SIZE;
+		} else {
+			seg_len += PAGE_SIZE;
+		}
+	}
 
 	ret = sg_alloc_table(sgt, chunks, gfp_mask);
 	if (unlikely(ret))
@@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	/* merging chunks and putting them into the scatterlist */
 	cur_page = 0;
 	for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-		unsigned long chunk_size;
+		unsigned int chunk_size;
 		unsigned int j;
 
 		/* look for the end of the current chunk */
+		seg_len = PAGE_SIZE;
 		for (j = cur_page + 1; j < n_pages; ++j)
-			if (page_to_pfn(pages[j]) !=
+			if (seg_len >= max_segment ||
+			    page_to_pfn(pages[j]) !=
 			    page_to_pfn(pages[j - 1]) + 1)
 				break;
+			else
+				seg_len += PAGE_SIZE;
 
 		chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-		sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+		sg_set_page(s, pages[cur_page],
+			    min_t(unsigned long, size, chunk_size), offset);
 		size -= chunk_size;
 		offset = 0;
 		cur_page = j;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  2016-11-11  8:50 [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
  2016-11-11  8:50 ` [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
  2016-11-11  8:50 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
@ 2016-11-11  8:50 ` Tvrtko Ursulin
  2016-11-11 10:24   ` [Intel-gfx] " Chris Wilson
  2016-11-11  8:50 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
  2016-11-14 11:45 ` [Intel-gfx] [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
  4 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11  8:50 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, Tvrtko Ursulin, Masahiro Yamada

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Drivers like i915 benefit from being able to control the maxium
size of the sg coallesced segment while building the scatter-
gather list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow it that control.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
---
 include/linux/scatterlist.h | 11 +++++----
 lib/scatterlist.c           | 55 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..29591dbb20fd 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 		     struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, gfp_t gfp_mask,
+				unsigned int max_segment);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
 		      size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index de15f369b317..c0521420f931 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *			       an array of pages
- * @sgt:	The sg table header to use
- * @pages:	Pointer to an array of page pointers
- * @n_pages:	Number of pages in the pages array
- * @offset:     Offset from start of the first page to the start of a buffer
- * @size:       Number of valid bytes in the buffer (after offset)
- * @gfp_mask:	GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			         an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:	 GFP allocation mask
+ * @max_segment: Maximum size of a single scatterlist node in bytes
  *
  *  Description:
  *    Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,12 +390,11 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, gfp_t gfp_mask,
+				unsigned int max_segment)
 {
-	const unsigned int max_segment = UINT_MAX;
 	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
@@ -444,6 +444,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			       an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:	 GFP allocation mask
+ *
+ *  Description:
+ *    Allocate and initialize an sg table from a list of pages. Contiguous
+ *    ranges of the pages are squashed into a single scatterlist node. A user
+ *    may provide an offset at a start and a size of valid data in a buffer
+ *    specified by the page array. The returned sg table is released by
+ *    sg_free_table.
+ *
+ * Returns:
+ *   0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask)
+{
+	return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset,
+					   size, gfp_mask, UINT_MAX);
+}
 EXPORT_SYMBOL(sg_alloc_table_from_pages);
 
 void __sg_page_iter_start(struct sg_page_iter *piter,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
  2016-11-11  8:50 [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2016-11-11  8:50 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
@ 2016-11-11  8:50 ` Tvrtko Ursulin
  2016-11-11  8:59   ` [PATCH v3 " Tvrtko Ursulin
  2016-11-11 10:23   ` [PATCH " Chris Wilson
  2016-11-14 11:45 ` [Intel-gfx] [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
  4 siblings, 2 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11  8:50 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, Tvrtko Ursulin, Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

With the addition of __sg_alloc_table_from_pages we can control
the maximum coallescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h         |  9 +++++++++
 drivers/gpu/drm/i915/i915_gem.c         | 15 +--------------
 drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++++++----------------------
 3 files changed, 16 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30777dee3f9c..319f8def0f86 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4175,4 +4175,13 @@ int remap_io_mapping(struct vm_area_struct *vma,
 	__T;								\
 })
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+#if IS_ENABLED(CONFIG_SWIOTLB)
+	return swiotlb_nr_tbl() << IO_TLB_SHIFT;
+#else
+	return UINT_MAX;
+#endif
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c20edba7f2a..cb4c188a395c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 	mutex_unlock(&obj->mm.lock);
 }
 
-static unsigned int swiotlb_max_size(void)
-{
-#if IS_ENABLED(CONFIG_SWIOTLB)
-	return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
-#else
-	return 0;
-#endif
-}
-
 static void i915_sg_trim(struct sg_table *orig_st)
 {
 	struct sg_table new_st;
@@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment;
+	unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
 	int ret;
 	gfp_t gfp;
 
@@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-	max_segment = swiotlb_max_size();
-	if (!max_segment)
-		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 64261639f547..b4461f1832a6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,36 +390,20 @@ struct get_pages_work {
 	struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
 static int
 st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
 {
-	struct scatterlist *sg;
-	int ret, n;
+	int ret;
 
 	*st = kmalloc(sizeof(**st), GFP_KERNEL);
 	if (*st == NULL)
 		return -ENOMEM;
 
-	if (swiotlb_active()) {
-		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-		if (ret)
-			goto err;
-
-		for_each_sg((*st)->sgl, sg, num_pages, n)
-			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-	} else {
-		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-						0, num_pages << PAGE_SHIFT,
-						GFP_KERNEL);
-		if (ret)
-			goto err;
-	}
+	ret = __sg_alloc_table_from_pages(*st, pvec, num_pages, 0,
+					  num_pages << PAGE_SHIFT,
+					  GFP_KERNEL, i915_sg_segment_size());
+	if (ret)
+		goto err;
 
 	return 0;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
  2016-11-11  8:50 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
@ 2016-11-11  8:59   ` Tvrtko Ursulin
  2016-11-11 10:23   ` [PATCH " Chris Wilson
  1 sibling, 0 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11  8:59 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, Tvrtko Ursulin, Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

With the addition of __sg_alloc_table_from_pages we can control
the maximum coallescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h         | 11 +++++++++++
 drivers/gpu/drm/i915/i915_gem.c         | 15 +--------------
 drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++++++----------------------
 3 files changed, 18 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30777dee3f9c..38a555c9c44b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4175,4 +4175,15 @@ int remap_io_mapping(struct vm_area_struct *vma,
 	__T;								\
 })
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+#if IS_ENABLED(CONFIG_SWIOTLB)
+	unsigned int nr_tbl = swiotlb_nr_tbl();
+
+	return nr_tbl > 0 ? nr_tbl << IO_TLB_SHIFT : UINT_MAX;
+#else
+	return UINT_MAX;
+#endif
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c20edba7f2a..cb4c188a395c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 	mutex_unlock(&obj->mm.lock);
 }
 
-static unsigned int swiotlb_max_size(void)
-{
-#if IS_ENABLED(CONFIG_SWIOTLB)
-	return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
-#else
-	return 0;
-#endif
-}
-
 static void i915_sg_trim(struct sg_table *orig_st)
 {
 	struct sg_table new_st;
@@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment;
+	unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
 	int ret;
 	gfp_t gfp;
 
@@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-	max_segment = swiotlb_max_size();
-	if (!max_segment)
-		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 64261639f547..b4461f1832a6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,36 +390,20 @@ struct get_pages_work {
 	struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
 static int
 st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
 {
-	struct scatterlist *sg;
-	int ret, n;
+	int ret;
 
 	*st = kmalloc(sizeof(**st), GFP_KERNEL);
 	if (*st == NULL)
 		return -ENOMEM;
 
-	if (swiotlb_active()) {
-		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-		if (ret)
-			goto err;
-
-		for_each_sg((*st)->sgl, sg, num_pages, n)
-			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-	} else {
-		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-						0, num_pages << PAGE_SHIFT,
-						GFP_KERNEL);
-		if (ret)
-			goto err;
-	}
+	ret = __sg_alloc_table_from_pages(*st, pvec, num_pages, 0,
+					  num_pages << PAGE_SHIFT,
+					  GFP_KERNEL, i915_sg_segment_size());
+	if (ret)
+		goto err;
 
 	return 0;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2016-11-11  8:50 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
@ 2016-11-11 10:19   ` Chris Wilson
  0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2016-11-11 10:19 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx, Masahiro Yamada, linux-kernel

On Fri, Nov 11, 2016 at 08:50:18AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Since the scatterlist length field is an unsigned int, make
> sure that sg_alloc_table_from_pages does not overflow it while
> coallescing pages to a single entry.
> 
> v2: Drop reference to future use. Use UINT_MAX.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: linux-kernel@vger.kernel.org
> ---
>  lib/scatterlist.c | 25 +++++++++++++++++++------
>  1 file changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index e05e7fc98892..de15f369b317 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>  	unsigned int offset, unsigned long size,
>  	gfp_t gfp_mask)
>  {
> -	unsigned int chunks;
> +	const unsigned int max_segment = UINT_MAX;
> +	unsigned int seg_len, chunks;
>  	unsigned int i;
>  	unsigned int cur_page;
>  	int ret;
> @@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>  
>  	/* compute number of contiguous chunks */
>  	chunks = 1;
> -	for (i = 1; i < n_pages; ++i)
> -		if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
> +	seg_len = PAGE_SIZE;
> +	for (i = 1; i < n_pages; ++i) {
> +		if (seg_len >= max_segment ||
> +		    page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
>  			++chunks;
> +			seg_len = PAGE_SIZE;
> +		} else {
> +			seg_len += PAGE_SIZE;
> +		}
> +	}
>  
>  	ret = sg_alloc_table(sgt, chunks, gfp_mask);
>  	if (unlikely(ret))
> @@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>  	/* merging chunks and putting them into the scatterlist */
>  	cur_page = 0;
>  	for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
> -		unsigned long chunk_size;
> +		unsigned int chunk_size;
>  		unsigned int j;
>  
>  		/* look for the end of the current chunk */
> +		seg_len = PAGE_SIZE;
>  		for (j = cur_page + 1; j < n_pages; ++j)
> -			if (page_to_pfn(pages[j]) !=
> +			if (seg_len >= max_segment ||
> +			    page_to_pfn(pages[j]) !=
>  			    page_to_pfn(pages[j - 1]) + 1)
>  				break;
> +			else
> +				seg_len += PAGE_SIZE;
>  
>  		chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;

chunk_size = seg_len - offset;
?

> -		sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
> +		sg_set_page(s, pages[cur_page],
> +			    min_t(unsigned long, size, chunk_size), offset);
>  		size -= chunk_size;
>  		offset = 0;
>  		cur_page = j;

Could BUG_ON(size); afterwards and BUG_ON(!size) inside? Although that
requires the caller didn't make a mistake in n_pages vs offset+size.

Minor comments,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
  2016-11-11  8:50 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
  2016-11-11  8:59   ` [PATCH v3 " Tvrtko Ursulin
@ 2016-11-11 10:23   ` Chris Wilson
  2016-11-11 12:36     ` [PATCH v4 " Tvrtko Ursulin
  1 sibling, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2016-11-11 10:23 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx, linux-kernel, Tvrtko Ursulin

On Fri, Nov 11, 2016 at 08:50:20AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> With the addition of __sg_alloc_table_from_pages we can control
> the maximum coallescing size and eliminate a separate path for
> allocating backing store here.
> 
> Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
> SWIOTLB max segment size") this enables more compact sg lists to
> be created and so has a beneficial effect on workloads with many
> and/or large objects of this class.
> 
> v2:
>  * Rename helper to i915_sg_segment_size and fix swiotlb override.
>  * Commit message update.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  9 +++++++++
>  drivers/gpu/drm/i915/i915_gem.c         | 15 +--------------
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++++++----------------------
>  3 files changed, 16 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 30777dee3f9c..319f8def0f86 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -4175,4 +4175,13 @@ int remap_io_mapping(struct vm_area_struct *vma,
>  	__T;								\
>  })
>  
> +static inline unsigned int i915_sg_segment_size(void)
> +{
> +#if IS_ENABLED(CONFIG_SWIOTLB)
> +	return swiotlb_nr_tbl() << IO_TLB_SHIFT;
> +#else
> +	return UINT_MAX;
> +#endif
> +}
> +
>  #endif
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1c20edba7f2a..cb4c188a395c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
>  	mutex_unlock(&obj->mm.lock);
>  }
>  
> -static unsigned int swiotlb_max_size(void)
> -{
> -#if IS_ENABLED(CONFIG_SWIOTLB)
> -	return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
> -#else
> -	return 0;
> -#endif
> -}
> -
>  static void i915_sg_trim(struct sg_table *orig_st)
>  {
>  	struct sg_table new_st;
> @@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  	struct sgt_iter sgt_iter;
>  	struct page *page;
>  	unsigned long last_pfn = 0;	/* suppress gcc warning */
> -	unsigned int max_segment;
> +	unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
>  	int ret;
>  	gfp_t gfp;
>  
> @@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
>  	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
>  
> -	max_segment = swiotlb_max_size();
> -	if (!max_segment)
> -		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
> -
>  	st = kmalloc(sizeof(*st), GFP_KERNEL);
>  	if (st == NULL)
>  		return ERR_PTR(-ENOMEM);
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 64261639f547..b4461f1832a6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -390,36 +390,20 @@ struct get_pages_work {
>  	struct task_struct *task;
>  };
>  
> -#if IS_ENABLED(CONFIG_SWIOTLB)
> -#define swiotlb_active() swiotlb_nr_tbl()
> -#else
> -#define swiotlb_active() 0
> -#endif
> -
>  static int
>  st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
>  {
> -	struct scatterlist *sg;
> -	int ret, n;
> +	int ret;
>  
>  	*st = kmalloc(sizeof(**st), GFP_KERNEL);
>  	if (*st == NULL)
>  		return -ENOMEM;
>  
> -	if (swiotlb_active()) {
> -		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
> -		if (ret)
> -			goto err;
> -
> -		for_each_sg((*st)->sgl, sg, num_pages, n)
> -			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
> -	} else {
> -		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
> -						0, num_pages << PAGE_SHIFT,
> -						GFP_KERNEL);
> -		if (ret)
> -			goto err;
> -	}
> +	ret = __sg_alloc_table_from_pages(*st, pvec, num_pages, 0,
> +					  num_pages << PAGE_SHIFT,

Petty:

ret = __sg_alloc_table_from_pages(*st, pvec, num_pages,

pvec + num_pages are paired

				  0, num_pages << PAGE_SHIFT,
offset + size are paired

				  i915_sg_segment_size()),
				  GFP_KERNEL);

And for some reason I always like to see gfp_t last.

Otherwise looks good,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  2016-11-11  8:50 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
@ 2016-11-11 10:24   ` Chris Wilson
  2016-11-11 12:35     ` [PATCH v2 " Tvrtko Ursulin
  2016-11-11 14:17     ` [PATCH v3 " Tvrtko Ursulin
  0 siblings, 2 replies; 19+ messages in thread
From: Chris Wilson @ 2016-11-11 10:24 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx, Masahiro Yamada, linux-kernel

On Fri, Nov 11, 2016 at 08:50:19AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Drivers like i915 benefit from being able to control the maxium
> size of the sg coallesced segment while building the scatter-
> gather list.
> 
> Introduce and export the __sg_alloc_table_from_pages function
> which will allow it that control.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: linux-kernel@vger.kernel.org


> ---
>  include/linux/scatterlist.h | 11 +++++----
>  lib/scatterlist.c           | 55 ++++++++++++++++++++++++++++++++++-----------
>  2 files changed, 49 insertions(+), 17 deletions(-)
> 
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index c981bee1a3ae..29591dbb20fd 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
>  int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
>  		     struct scatterlist *, gfp_t, sg_alloc_fn *);
>  int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
> -int sg_alloc_table_from_pages(struct sg_table *sgt,
> -	struct page **pages, unsigned int n_pages,
> -	unsigned int offset, unsigned long size,
> -	gfp_t gfp_mask);
> +int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
> +				unsigned int n_pages, unsigned int offset,
> +				unsigned long size, gfp_t gfp_mask,
> +				unsigned int max_segment);

Just the question of parameter order, I like gfp_t last :)
And I think offset / size / max_segment tie together.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  2016-11-11 10:24   ` [Intel-gfx] " Chris Wilson
@ 2016-11-11 12:35     ` Tvrtko Ursulin
  2016-11-11 14:17     ` [PATCH v3 " Tvrtko Ursulin
  1 sibling, 0 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11 12:35 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, Tvrtko Ursulin, Masahiro Yamada, Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Drivers like i915 benefit from being able to control the maxium
size of the sg coallesced segment while building the scatter-
gather list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow it that control.

v2: Reorder parameters. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 include/linux/scatterlist.h | 11 +++++----
 lib/scatterlist.c           | 55 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..16b740afeed2 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 		     struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
 		      size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index de15f369b317..8ee1ece81626 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *			       an array of pages
- * @sgt:	The sg table header to use
- * @pages:	Pointer to an array of page pointers
- * @n_pages:	Number of pages in the pages array
- * @offset:     Offset from start of the first page to the start of a buffer
- * @size:       Number of valid bytes in the buffer (after offset)
- * @gfp_mask:	GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			         an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a single scatterlist node in bytes
+ * @gfp_mask:	 GFP allocation mask
  *
  *  Description:
  *    Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,12 +390,11 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask)
 {
-	const unsigned int max_segment = UINT_MAX;
 	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
@@ -444,6 +444,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			       an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:	 GFP allocation mask
+ *
+ *  Description:
+ *    Allocate and initialize an sg table from a list of pages. Contiguous
+ *    ranges of the pages are squashed into a single scatterlist node. A user
+ *    may provide an offset at a start and a size of valid data in a buffer
+ *    specified by the page array. The returned sg table is released by
+ *    sg_free_table.
+ *
+ * Returns:
+ *   0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask)
+{
+	return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset,
+					   size, gfp_mask, UINT_MAX);
+}
 EXPORT_SYMBOL(sg_alloc_table_from_pages);
 
 void __sg_page_iter_start(struct sg_page_iter *piter,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
  2016-11-11 10:23   ` [PATCH " Chris Wilson
@ 2016-11-11 12:36     ` Tvrtko Ursulin
  2016-11-14  9:52       ` Chris Wilson
  0 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11 12:36 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, Tvrtko Ursulin, Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

With the addition of __sg_alloc_table_from_pages we can control
the maximum coallescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
---
 drivers/gpu/drm/i915/i915_drv.h         | 11 +++++++++++
 drivers/gpu/drm/i915/i915_gem.c         | 15 +--------------
 drivers/gpu/drm/i915/i915_gem_userptr.c | 29 +++++++----------------------
 3 files changed, 19 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30777dee3f9c..38a555c9c44b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4175,4 +4175,15 @@ int remap_io_mapping(struct vm_area_struct *vma,
 	__T;								\
 })
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+#if IS_ENABLED(CONFIG_SWIOTLB)
+	unsigned int nr_tbl = swiotlb_nr_tbl();
+
+	return nr_tbl > 0 ? nr_tbl << IO_TLB_SHIFT : UINT_MAX;
+#else
+	return UINT_MAX;
+#endif
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c20edba7f2a..cb4c188a395c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 	mutex_unlock(&obj->mm.lock);
 }
 
-static unsigned int swiotlb_max_size(void)
-{
-#if IS_ENABLED(CONFIG_SWIOTLB)
-	return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
-#else
-	return 0;
-#endif
-}
-
 static void i915_sg_trim(struct sg_table *orig_st)
 {
 	struct sg_table new_st;
@@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment;
+	unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
 	int ret;
 	gfp_t gfp;
 
@@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-	max_segment = swiotlb_max_size();
-	if (!max_segment)
-		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 64261639f547..f3fcb2d1bde9 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,36 +390,21 @@ struct get_pages_work {
 	struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
 static int
 st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
 {
-	struct scatterlist *sg;
-	int ret, n;
+	int ret;
 
 	*st = kmalloc(sizeof(**st), GFP_KERNEL);
 	if (*st == NULL)
 		return -ENOMEM;
 
-	if (swiotlb_active()) {
-		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-		if (ret)
-			goto err;
-
-		for_each_sg((*st)->sgl, sg, num_pages, n)
-			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-	} else {
-		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-						0, num_pages << PAGE_SHIFT,
-						GFP_KERNEL);
-		if (ret)
-			goto err;
-	}
+	ret = __sg_alloc_table_from_pages(*st, pvec, num_pages,
+					  0, num_pages << PAGE_SHIFT,
+					  i915_sg_segment_size(),
+					  GFP_KERNEL);
+	if (ret)
+		goto err;
 
 	return 0;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  2016-11-11 10:24   ` [Intel-gfx] " Chris Wilson
  2016-11-11 12:35     ` [PATCH v2 " Tvrtko Ursulin
@ 2016-11-11 14:17     ` Tvrtko Ursulin
  2016-11-14  9:51       ` Chris Wilson
  1 sibling, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11 14:17 UTC (permalink / raw)
  To: Intel-gfx; +Cc: linux-kernel, Tvrtko Ursulin, Masahiro Yamada, Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Drivers like i915 benefit from being able to control the maxium
size of the sg coallesced segment while building the scatter-
gather list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow it that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 include/linux/scatterlist.h | 11 +++++----
 lib/scatterlist.c           | 55 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..16b740afeed2 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 		     struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
 		      size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index de15f369b317..7f4d637d3606 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *			       an array of pages
- * @sgt:	The sg table header to use
- * @pages:	Pointer to an array of page pointers
- * @n_pages:	Number of pages in the pages array
- * @offset:     Offset from start of the first page to the start of a buffer
- * @size:       Number of valid bytes in the buffer (after offset)
- * @gfp_mask:	GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			         an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a single scatterlist node in bytes
+ * @gfp_mask:	 GFP allocation mask
  *
  *  Description:
  *    Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,12 +390,11 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask)
 {
-	const unsigned int max_segment = UINT_MAX;
 	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
@@ -444,6 +444,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			       an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:	 GFP allocation mask
+ *
+ *  Description:
+ *    Allocate and initialize an sg table from a list of pages. Contiguous
+ *    ranges of the pages are squashed into a single scatterlist node. A user
+ *    may provide an offset at a start and a size of valid data in a buffer
+ *    specified by the page array. The returned sg table is released by
+ *    sg_free_table.
+ *
+ * Returns:
+ *   0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask)
+{
+	return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset,
+					   size, UINT_MAX, gfp_mask);
+}
 EXPORT_SYMBOL(sg_alloc_table_from_pages);
 
 void __sg_page_iter_start(struct sg_page_iter *piter,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  2016-11-11 14:17     ` [PATCH v3 " Tvrtko Ursulin
@ 2016-11-14  9:51       ` Chris Wilson
  0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2016-11-14  9:51 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx, linux-kernel, Tvrtko Ursulin, Masahiro Yamada

On Fri, Nov 11, 2016 at 02:17:44PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Drivers like i915 benefit from being able to control the maxium
> size of the sg coallesced segment while building the scatter-
> gather list.
> 
> Introduce and export the __sg_alloc_table_from_pages function
> which will allow it that control.
> 
> v2: Reorder parameters. (Chris Wilson)
> v3: Fix incomplete reordering in v2.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

Works for me,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
  2016-11-11 12:36     ` [PATCH v4 " Tvrtko Ursulin
@ 2016-11-14  9:52       ` Chris Wilson
  0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2016-11-14  9:52 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx, linux-kernel, Tvrtko Ursulin

On Fri, Nov 11, 2016 at 12:36:13PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> With the addition of __sg_alloc_table_from_pages we can control
> the maximum coallescing size and eliminate a separate path for
> allocating backing store here.
> 
> Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
> SWIOTLB max segment size") this enables more compact sg lists to
> be created and so has a beneficial effect on workloads with many
> and/or large objects of this class.
> 
> v2:
>  * Rename helper to i915_sg_segment_size and fix swiotlb override.
>  * Commit message update.
> 
> v3:
>  * Actually include the swiotlb override fix.
> 
> v4:
>  * Regroup parameters a bit. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
(bump)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
  2016-11-11  8:50 ` [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
@ 2016-11-14  9:55   ` Chris Wilson
  2016-11-22 14:31     ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2016-11-14  9:55 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Intel-gfx, Tomasz Stanislawski, Pawel Osciak, linux-kernel,
	Masahiro Yamada, Kyungmin Park, Matt Porter, linux-media,
	Alexandre Bounine, Marek Szyprowski

On Fri, Nov 11, 2016 at 08:50:17AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Scatterlist entries have an unsigned int for the offset so
> correct the sg_alloc_table_from_pages function accordingly.
> 
> Since these are offsets withing a page, unsigned int is
> wide enough.
> 
> Also converts callers which were using unsigned long locally
> with the lower_32_bits annotation to make it explicitly
> clear what is happening.
> 
> v2: Use offset_in_page. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: Pawel Osciak <pawel@osciak.com>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
> Cc: Matt Porter <mporter@kernel.crashing.org>
> Cc: Alexandre Bounine <alexandre.bounine@idt.com>
> Cc: linux-media@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)

If there were kerneldoc, it would nicely explain that having an offset
larger then a page is silly when passing in array of pages.

Changes elsewhere look ok (personally I'd be happy with just
offset_in_page(), 4GiB superpages are somebody else's problem :)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH 0/4] Compact userptr object backing store allocation
  2016-11-11  8:50 [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2016-11-11  8:50 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
@ 2016-11-14 11:45 ` Tvrtko Ursulin
  2016-11-28 10:23   ` Tvrtko Ursulin
  4 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-14 11:45 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx, Andrew Morton; +Cc: linux-kernel


Hi Andrew,

On 11/11/2016 08:50, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Userptr backing store with SWIOTBL active is currently allocated in the same
> inefficient manner, with one sg entry per object page, as what the commit
> 871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment size") fixed
> for regular GEM objects.
>
> We can fix that by adding new a __sg_alloc_table_from_pages core function which
> allows us to control the maximum desired coalesced segment size.
>
> Other than that the series starts with two simple fixes to
> sg_alloc_table_from_pages which deal with incorrect data type usage and a
> theoretical overflow condition. Fixing the latter enables easy addition of the
> above mentioned __sg_alloc_table_from_pages.
>
> Tvrtko Ursulin (4):
>   lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
>   lib/scatterlist: Avoid potential scatterlist entry overflow
>   lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
>   drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
>
>  drivers/gpu/drm/i915/i915_drv.h                |  9 +++
>  drivers/gpu/drm/i915/i915_gem.c                | 15 +----
>  drivers/gpu/drm/i915/i915_gem_userptr.c        | 28 ++-------
>  drivers/media/v4l2-core/videobuf2-dma-contig.c |  4 +-
>  drivers/rapidio/devices/rio_mport_cdev.c       |  4 +-
>  include/linux/scatterlist.h                    | 11 ++--
>  lib/scatterlist.c                              | 78 ++++++++++++++++++++------
>  7 files changed, 87 insertions(+), 62 deletions(-)

I have three patches to lib/scatterlist.c in this series which has all 
been reviewed and tested.

We would like to merge them via the DRM tree if there are no objections?

Kind regards,

Tvrtko

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
  2016-11-14  9:55   ` [Intel-gfx] " Chris Wilson
@ 2016-11-22 14:31     ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 19+ messages in thread
From: Mauro Carvalho Chehab @ 2016-11-22 14:31 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Tvrtko Ursulin, Intel-gfx, Tomasz Stanislawski, Pawel Osciak,
	linux-kernel, Masahiro Yamada, Kyungmin Park, Matt Porter,
	linux-media, Alexandre Bounine, Marek Szyprowski

Em Mon, 14 Nov 2016 09:55:48 +0000
Chris Wilson <chris@chris-wilson.co.uk> escreveu:

> On Fri, Nov 11, 2016 at 08:50:17AM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > Scatterlist entries have an unsigned int for the offset so
> > correct the sg_alloc_table_from_pages function accordingly.
> > 
> > Since these are offsets withing a page, unsigned int is
> > wide enough.
> > 
> > Also converts callers which were using unsigned long locally
> > with the lower_32_bits annotation to make it explicitly
> > clear what is happening.
> > 
> > v2: Use offset_in_page. (Chris Wilson)
> > 
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> > Cc: Pawel Osciak <pawel@osciak.com>
> > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > Cc: Kyungmin Park <kyungmin.park@samsung.com>
> > Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
> > Cc: Matt Porter <mporter@kernel.crashing.org>
> > Cc: Alexandre Bounine <alexandre.bounine@idt.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)  
> 
> If there were kerneldoc, it would nicely explain that having an offset
> larger then a page is silly when passing in array of pages.
> 
> Changes elsewhere look ok (personally I'd be happy with just
> offset_in_page(), 4GiB superpages are somebody else's problem :)

For the media changes, that looked OK. We don't have any needs
to stream 4GB images nowadays :-)

Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
> 


Thanks,
Mauro

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH 0/4] Compact userptr object backing store allocation
  2016-11-14 11:45 ` [Intel-gfx] [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
@ 2016-11-28 10:23   ` Tvrtko Ursulin
  0 siblings, 0 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 10:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx, Andrew Morton; +Cc: linux-kernel, Chris Wilson


On 14/11/2016 11:45, Tvrtko Ursulin wrote:
>
> Hi Andrew,
>
> On 11/11/2016 08:50, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Userptr backing store with SWIOTBL active is currently allocated in
>> the same
>> inefficient manner, with one sg entry per object page, as what the commit
>> 871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment
>> size") fixed
>> for regular GEM objects.
>>
>> We can fix that by adding new a __sg_alloc_table_from_pages core
>> function which
>> allows us to control the maximum desired coalesced segment size.
>>
>> Other than that the series starts with two simple fixes to
>> sg_alloc_table_from_pages which deal with incorrect data type usage and a
>> theoretical overflow condition. Fixing the latter enables easy
>> addition of the
>> above mentioned __sg_alloc_table_from_pages.
>>
>> Tvrtko Ursulin (4):
>>   lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
>>   lib/scatterlist: Avoid potential scatterlist entry overflow
>>   lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
>>   drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
>>
>>  drivers/gpu/drm/i915/i915_drv.h                |  9 +++
>>  drivers/gpu/drm/i915/i915_gem.c                | 15 +----
>>  drivers/gpu/drm/i915/i915_gem_userptr.c        | 28 ++-------
>>  drivers/media/v4l2-core/videobuf2-dma-contig.c |  4 +-
>>  drivers/rapidio/devices/rio_mport_cdev.c       |  4 +-
>>  include/linux/scatterlist.h                    | 11 ++--
>>  lib/scatterlist.c                              | 78
>> ++++++++++++++++++++------
>>  7 files changed, 87 insertions(+), 62 deletions(-)
>
> I have three patches to lib/scatterlist.c in this series which has all
> been reviewed and tested.
>
> We would like to merge them via the DRM tree if there are no objections?

Excuse me for the re-ping - still looking for an ack to merge these few 
patches via the drm tree. (The code in question does not have a 
designated maintainer.)

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
  2017-01-16 14:12 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
@ 2017-01-30  9:44 ` Daniel Vetter
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2017-01-30  9:44 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Intel-gfx, Tomasz Stanislawski, Pawel Osciak, linux-kernel,
	Masahiro Yamada, Kyungmin Park, Matt Porter, linux-media,
	Alexandre Bounine, Marek Szyprowski

Hi all,

Ok if we merge the entire series through drm-intel (likely for 4.12, 4.11
is getting a bit late)? We'd like to use this there, and Mauro already
reviewed the v4l side ...

Thanks, Daniel

On Mon, Jan 16, 2017 at 02:12:07PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Scatterlist entries have an unsigned int for the offset so
> correct the sg_alloc_table_from_pages function accordingly.
> 
> Since these are offsets withing a page, unsigned int is
> wide enough.
> 
> Also converts callers which were using unsigned long locally
> with the lower_32_bits annotation to make it explicitly
> clear what is happening.
> 
> v2: Use offset_in_page. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: Pawel Osciak <pawel@osciak.com>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
> Cc: Matt Porter <mporter@kernel.crashing.org>
> Cc: Alexandre Bounine <alexandre.bounine@idt.com>
> Cc: linux-media@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
> ---
>  drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
>  drivers/rapidio/devices/rio_mport_cdev.c       | 4 ++--
>  include/linux/scatterlist.h                    | 2 +-
>  lib/scatterlist.c                              | 2 +-
>  4 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> index fb6a177be461..51e8765bc3c6 100644
> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> @@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
>  {
>  	struct vb2_dc_buf *buf;
>  	struct frame_vector *vec;
> -	unsigned long offset;
> +	unsigned int offset;
>  	int n_pages, i;
>  	int ret = 0;
>  	struct sg_table *sgt;
> @@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
>  	buf->dev = dev;
>  	buf->dma_dir = dma_dir;
>  
> -	offset = vaddr & ~PAGE_MASK;
> +	offset = lower_32_bits(offset_in_page(vaddr));
>  	vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
>  	if (IS_ERR(vec)) {
>  		ret = PTR_ERR(vec);
> diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
> index 9013a585507e..0fae29ff47ba 100644
> --- a/drivers/rapidio/devices/rio_mport_cdev.c
> +++ b/drivers/rapidio/devices/rio_mport_cdev.c
> @@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
>  	 * offset within the internal buffer specified by handle parameter.
>  	 */
>  	if (xfer->loc_addr) {
> -		unsigned long offset;
> +		unsigned int offset;
>  		long pinned;
>  
> -		offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
> +		offset = lower_32_bits(offset_in_page(xfer->loc_addr));
>  		nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
>  
>  		page_list = kmalloc_array(nr_pages,
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index cb3c8fe6acd7..c981bee1a3ae 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
>  int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
>  int sg_alloc_table_from_pages(struct sg_table *sgt,
>  	struct page **pages, unsigned int n_pages,
> -	unsigned long offset, unsigned long size,
> +	unsigned int offset, unsigned long size,
>  	gfp_t gfp_mask);
>  
>  size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index 004fc70fc56a..e05e7fc98892 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
>   */
>  int sg_alloc_table_from_pages(struct sg_table *sgt,
>  	struct page **pages, unsigned int n_pages,
> -	unsigned long offset, unsigned long size,
> +	unsigned int offset, unsigned long size,
>  	gfp_t gfp_mask)
>  {
>  	unsigned int chunks;
> -- 
> 2.7.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-01-30  9:44 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-11  8:50 [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
2016-11-11  8:50 ` [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2016-11-14  9:55   ` [Intel-gfx] " Chris Wilson
2016-11-22 14:31     ` Mauro Carvalho Chehab
2016-11-11  8:50 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
2016-11-11 10:19   ` [Intel-gfx] " Chris Wilson
2016-11-11  8:50 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
2016-11-11 10:24   ` [Intel-gfx] " Chris Wilson
2016-11-11 12:35     ` [PATCH v2 " Tvrtko Ursulin
2016-11-11 14:17     ` [PATCH v3 " Tvrtko Ursulin
2016-11-14  9:51       ` Chris Wilson
2016-11-11  8:50 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
2016-11-11  8:59   ` [PATCH v3 " Tvrtko Ursulin
2016-11-11 10:23   ` [PATCH " Chris Wilson
2016-11-11 12:36     ` [PATCH v4 " Tvrtko Ursulin
2016-11-14  9:52       ` Chris Wilson
2016-11-14 11:45 ` [Intel-gfx] [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
2016-11-28 10:23   ` Tvrtko Ursulin
2017-01-16 14:12 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2017-01-30  9:44 ` [Intel-gfx] " Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).