linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
@ 2017-01-11  9:00 Tvrtko Ursulin
  2017-01-11  9:00 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11  9:00 UTC (permalink / raw)
  To: Intel-gfx
  Cc: tursulin, Tvrtko Ursulin, Masahiro Yamada, Pawel Osciak,
	Marek Szyprowski, Kyungmin Park, Tomasz Stanislawski,
	Matt Porter, Alexandre Bounine, linux-media, linux-kernel

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.

Since these are offsets withing a page, unsigned int is
wide enough.

Also converts callers which were using unsigned long locally
with the lower_32_bits annotation to make it explicitly
clear what is happening.

v2: Use offset_in_page. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
 drivers/rapidio/devices/rio_mport_cdev.c       | 4 ++--
 include/linux/scatterlist.h                    | 2 +-
 lib/scatterlist.c                              | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index fb6a177be461..51e8765bc3c6 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 {
 	struct vb2_dc_buf *buf;
 	struct frame_vector *vec;
-	unsigned long offset;
+	unsigned int offset;
 	int n_pages, i;
 	int ret = 0;
 	struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 	buf->dev = dev;
 	buf->dma_dir = dma_dir;
 
-	offset = vaddr & ~PAGE_MASK;
+	offset = lower_32_bits(offset_in_page(vaddr));
 	vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
 	if (IS_ERR(vec)) {
 		ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 9013a585507e..0fae29ff47ba 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
 	 * offset within the internal buffer specified by handle parameter.
 	 */
 	if (xfer->loc_addr) {
-		unsigned long offset;
+		unsigned int offset;
 		long pinned;
 
-		offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+		offset = lower_32_bits(offset_in_page(xfer->loc_addr));
 		nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
 
 		page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 int sg_alloc_table_from_pages(struct sg_table *sgt,
 	struct page **pages, unsigned int n_pages,
-	unsigned long offset, unsigned long size,
+	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 004fc70fc56a..e05e7fc98892 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
  */
 int sg_alloc_table_from_pages(struct sg_table *sgt,
 	struct page **pages, unsigned int n_pages,
-	unsigned long offset, unsigned long size,
+	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
 	unsigned int chunks;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2017-01-11  9:00 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
@ 2017-01-11  9:00 ` Tvrtko Ursulin
  2017-01-11 11:58   ` [PATCH v4] " Tvrtko Ursulin
  2017-01-11  9:00 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11  9:00 UTC (permalink / raw)
  To: Intel-gfx; +Cc: tursulin, Tvrtko Ursulin, Masahiro Yamada, linux-kernel

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coallescing pages to a single entry.

v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
---
 lib/scatterlist.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..4fc54801cd29 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
-	unsigned int chunks;
+	const unsigned int max_segment = rounddown(UINT_MAX, PAGE_SIZE);
+	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
@@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	/* compute number of contiguous chunks */
 	chunks = 1;
-	for (i = 1; i < n_pages; ++i)
-		if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+	seg_len = PAGE_SIZE;
+	for (i = 1; i < n_pages; ++i) {
+		if (seg_len >= max_segment ||
+		    page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
 			++chunks;
+			seg_len = PAGE_SIZE;
+		} else {
+			seg_len += PAGE_SIZE;
+		}
+	}
 
 	ret = sg_alloc_table(sgt, chunks, gfp_mask);
 	if (unlikely(ret))
@@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	/* merging chunks and putting them into the scatterlist */
 	cur_page = 0;
 	for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-		unsigned long chunk_size;
+		unsigned int chunk_size;
 		unsigned int j;
 
 		/* look for the end of the current chunk */
+		seg_len = PAGE_SIZE;
 		for (j = cur_page + 1; j < n_pages; ++j)
-			if (page_to_pfn(pages[j]) !=
+			if (seg_len >= max_segment ||
+			    page_to_pfn(pages[j]) !=
 			    page_to_pfn(pages[j - 1]) + 1)
 				break;
+			else
+				seg_len += PAGE_SIZE;
 
 		chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-		sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+		sg_set_page(s, pages[cur_page],
+			    min_t(unsigned long, size, chunk_size), offset);
 		size -= chunk_size;
 		offset = 0;
 		cur_page = j;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  2017-01-11  9:00 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
  2017-01-11  9:00 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
@ 2017-01-11  9:00 ` Tvrtko Ursulin
  2017-01-11  9:00 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
  2017-01-11 11:59 ` [PATCH v5] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
  3 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11  9:00 UTC (permalink / raw)
  To: Intel-gfx
  Cc: tursulin, Tvrtko Ursulin, Masahiro Yamada, linux-kernel, Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Drivers like i915 benefit from being able to control the maxium
size of the sg coallesced segment while building the scatter-
gather list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow it that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
---
 include/linux/scatterlist.h | 11 ++++++---
 lib/scatterlist.c           | 59 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 53 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..16b740afeed2 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 		     struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
 		      size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 4fc54801cd29..df375ff18587 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *			       an array of pages
- * @sgt:	The sg table header to use
- * @pages:	Pointer to an array of page pointers
- * @n_pages:	Number of pages in the pages array
- * @offset:     Offset from start of the first page to the start of a buffer
- * @size:       Number of valid bytes in the buffer (after offset)
- * @gfp_mask:	GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			         an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask:	 GFP allocation mask
  *
  *  Description:
  *    Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,18 +390,20 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask)
 {
-	const unsigned int max_segment = rounddown(UINT_MAX, PAGE_SIZE);
 	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
 	struct scatterlist *s;
 
+	if (WARN_ON(offset_in_page(max_segment)))
+		return -EINVAL;
+
 	/* compute number of contiguous chunks */
 	chunks = 1;
 	seg_len = PAGE_SIZE;
@@ -444,6 +447,36 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			       an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:	 GFP allocation mask
+ *
+ *  Description:
+ *    Allocate and initialize an sg table from a list of pages. Contiguous
+ *    ranges of the pages are squashed into a single scatterlist node. A user
+ *    may provide an offset at a start and a size of valid data in a buffer
+ *    specified by the page array. The returned sg table is released by
+ *    sg_free_table.
+ *
+ * Returns:
+ *   0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask)
+{
+	return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset, size,
+					   rounddown(UINT_MAX, PAGE_SIZE),
+					   gfp_mask);
+}
 EXPORT_SYMBOL(sg_alloc_table_from_pages);
 
 void __sg_page_iter_start(struct sg_page_iter *piter,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
  2017-01-11  9:00 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
  2017-01-11  9:00 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
  2017-01-11  9:00 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
@ 2017-01-11  9:00 ` Tvrtko Ursulin
  2017-01-11 11:59   ` [PATCH v6] " Tvrtko Ursulin
  2017-01-11 11:59 ` [PATCH v5] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
  3 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11  9:00 UTC (permalink / raw)
  To: Intel-gfx; +Cc: tursulin, Tvrtko Ursulin, Chris Wilson, linux-kernel

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

With the addition of __sg_alloc_table_from_pages we can control
the maximum coallescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

v5:
 * Rebase for swiotlb_max_segment.
 * Add DMA map failure handling as in abb0deacb5a6
   ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
---
 drivers/gpu/drm/i915/i915_drv.h         | 10 +++++
 drivers/gpu/drm/i915/i915_gem.c         |  6 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c | 79 ++++++++++++---------------------
 3 files changed, 40 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2b325032fedc..a944ff0c5c68 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2594,6 +2594,16 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
 	     (((__iter).curr += PAGE_SIZE) < (__iter).max) ||		\
 	     ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+	unsigned int size = swiotlb_max_segment();
+
+	if (size == 0)
+		size = UINT_MAX;
+
+	return rounddown(size, PAGE_SIZE);
+}
+
 static inline const struct intel_device_info *
 intel_info(const struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 13c02015709c..421827069a2f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2248,7 +2248,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment;
+	unsigned int max_segment = i915_sg_segment_size();
 	int ret;
 	gfp_t gfp;
 
@@ -2259,10 +2259,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-	max_segment = swiotlb_max_segment();
-	if (!max_segment)
-		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 6a8fa085b74e..95b62b9c5cd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,64 +390,42 @@ struct get_pages_work {
 	struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-	struct scatterlist *sg;
-	int ret, n;
-
-	*st = kmalloc(sizeof(**st), GFP_KERNEL);
-	if (*st == NULL)
-		return -ENOMEM;
-
-	if (swiotlb_active()) {
-		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-		if (ret)
-			goto err;
-
-		for_each_sg((*st)->sgl, sg, num_pages, n)
-			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-	} else {
-		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-						0, num_pages << PAGE_SHIFT,
-						GFP_KERNEL);
-		if (ret)
-			goto err;
-	}
-
-	return 0;
-
-err:
-	kfree(*st);
-	*st = NULL;
-	return ret;
-}
-
 static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-			     struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+			       struct page **pvec, int num_pages)
 {
-	struct sg_table *pages;
+	unsigned int max_segment = i915_sg_segment_size();
+	struct sg_table *st;
 	int ret;
 
-	ret = st_set_pages(&pages, pvec, num_pages);
-	if (ret)
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (!st)
+		return ERR_PTR(-ENOMEM);
+
+alloc_table:
+	ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
+					  0, num_pages << PAGE_SHIFT,
+					  max_segment,
+					  GFP_KERNEL);
+	if (ret) {
+		kfree(st);
 		return ERR_PTR(ret);
+	}
 
-	ret = i915_gem_gtt_prepare_pages(obj, pages);
+	ret = i915_gem_gtt_prepare_pages(obj, st);
 	if (ret) {
-		sg_free_table(pages);
-		kfree(pages);
+		sg_free_table(st);
+
+		if (max_segment > PAGE_SIZE) {
+			max_segment = PAGE_SIZE;
+			goto alloc_table;
+		}
+
+		kfree(st);
 		return ERR_PTR(ret);
 	}
 
-	return pages;
+	return st;
 }
 
 static int
@@ -531,7 +509,8 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 		struct sg_table *pages = ERR_PTR(ret);
 
 		if (pinned == npages) {
-			pages = __i915_gem_userptr_set_pages(obj, pvec, npages);
+			pages = __i915_gem_userptr_alloc_pages(obj, pvec,
+							       npages);
 			if (!IS_ERR(pages)) {
 				__i915_gem_object_set_pages(obj, pages);
 				pinned = 0;
@@ -653,7 +632,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 	else if (pinned < num_pages)
 		pages = __i915_gem_userptr_get_pages_schedule(obj, &active);
 	else
-		pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages);
+		pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
 	if (IS_ERR(pages)) {
 		__i915_gem_userptr_set_active(obj, active);
 		release_pages(pvec, pinned, 0);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2017-01-11  9:00 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
@ 2017-01-11 11:58   ` Tvrtko Ursulin
  2017-01-11 23:59     ` Andy Shevchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11 11:58 UTC (permalink / raw)
  To: Intel-gfx
  Cc: tursulin, Tvrtko Ursulin, Masahiro Yamada, linux-kernel, Joonas Lahtinen

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coallescing pages to a single entry.

v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.
v4: Do not rely on compiler to optimise out the rounddown.
    (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 include/linux/scatterlist.h |  6 ++++++
 lib/scatterlist.c           | 25 +++++++++++++++++++------
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..15265bb6e5c3 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -21,6 +21,12 @@ struct scatterlist {
 };
 
 /*
+ * Since the above length field is an unsigned int, below we define the maximum
+ * lenght in bytes that can be stored in one scatterlist entry.
+ */
+#define SCATTERLIST_MAX_SEGMENT (0xfffff000)
+
+/*
  * These macros should be used after a dma_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries dma_map_sg
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..24beb0965e69 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
-	unsigned int chunks;
+	const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
+	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
@@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	/* compute number of contiguous chunks */
 	chunks = 1;
-	for (i = 1; i < n_pages; ++i)
-		if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+	seg_len = PAGE_SIZE;
+	for (i = 1; i < n_pages; ++i) {
+		if (seg_len >= max_segment ||
+		    page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
 			++chunks;
+			seg_len = PAGE_SIZE;
+		} else {
+			seg_len += PAGE_SIZE;
+		}
+	}
 
 	ret = sg_alloc_table(sgt, chunks, gfp_mask);
 	if (unlikely(ret))
@@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	/* merging chunks and putting them into the scatterlist */
 	cur_page = 0;
 	for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-		unsigned long chunk_size;
+		unsigned int chunk_size;
 		unsigned int j;
 
 		/* look for the end of the current chunk */
+		seg_len = PAGE_SIZE;
 		for (j = cur_page + 1; j < n_pages; ++j)
-			if (page_to_pfn(pages[j]) !=
+			if (seg_len >= max_segment ||
+			    page_to_pfn(pages[j]) !=
 			    page_to_pfn(pages[j - 1]) + 1)
 				break;
+			else
+				seg_len += PAGE_SIZE;
 
 		chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-		sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+		sg_set_page(s, pages[cur_page],
+			    min_t(unsigned long, size, chunk_size), offset);
 		size -= chunk_size;
 		offset = 0;
 		cur_page = j;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
  2017-01-11  9:00 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2017-01-11  9:00 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
@ 2017-01-11 11:59 ` Tvrtko Ursulin
  3 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11 11:59 UTC (permalink / raw)
  To: Intel-gfx
  Cc: tursulin, Tvrtko Ursulin, Masahiro Yamada, linux-kernel,
	Chris Wilson, Joonas Lahtinen

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Drivers like i915 benefit from being able to control the maxium
size of the sg coallesced segment while building the scatter-
gather list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow it that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.
v5: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 include/linux/scatterlist.h | 11 +++++----
 lib/scatterlist.c           | 58 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 15265bb6e5c3..c897533fb85d 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -267,10 +267,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 		     struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
 		      size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 24beb0965e69..732c6e516657 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *			       an array of pages
- * @sgt:	The sg table header to use
- * @pages:	Pointer to an array of page pointers
- * @n_pages:	Number of pages in the pages array
- * @offset:     Offset from start of the first page to the start of a buffer
- * @size:       Number of valid bytes in the buffer (after offset)
- * @gfp_mask:	GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			         an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask:	 GFP allocation mask
  *
  *  Description:
  *    Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,18 +390,20 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask)
 {
-	const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
 	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
 	struct scatterlist *s;
 
+	if (WARN_ON(!max_segment || offset_in_page(max_segment)))
+		return -EINVAL;
+
 	/* compute number of contiguous chunks */
 	chunks = 1;
 	seg_len = PAGE_SIZE;
@@ -444,6 +447,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			       an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:	 GFP allocation mask
+ *
+ *  Description:
+ *    Allocate and initialize an sg table from a list of pages. Contiguous
+ *    ranges of the pages are squashed into a single scatterlist node. A user
+ *    may provide an offset at a start and a size of valid data in a buffer
+ *    specified by the page array. The returned sg table is released by
+ *    sg_free_table.
+ *
+ * Returns:
+ *   0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask)
+{
+	return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset, size,
+					   SCATTERLIST_MAX_SEGMENT, gfp_mask);
+}
 EXPORT_SYMBOL(sg_alloc_table_from_pages);
 
 void __sg_page_iter_start(struct sg_page_iter *piter,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
  2017-01-11  9:00 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
@ 2017-01-11 11:59   ` Tvrtko Ursulin
  2017-01-12 18:20     ` [Intel-gfx] " kbuild test robot
  0 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11 11:59 UTC (permalink / raw)
  To: Intel-gfx
  Cc: tursulin, Tvrtko Ursulin, Chris Wilson, linux-kernel, Joonas Lahtinen

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

With the addition of __sg_alloc_table_from_pages we can control
the maximum coallescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

v5:
 * Rebase for swiotlb_max_segment.
 * Add DMA map failure handling as in abb0deacb5a6
   ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").

v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 15 +++++++
 drivers/gpu/drm/i915/i915_gem.c         |  6 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c | 79 ++++++++++++---------------------
 3 files changed, 45 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a320675a9e71..5646e48a893b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2598,6 +2598,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
 	     (((__iter).curr += PAGE_SIZE) < (__iter).max) ||		\
 	     ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+	unsigned int size = swiotlb_max_segment();
+
+	if (size == 0)
+		return SCATTERLIST_MAX_SEGMENT;
+
+	size = rounddown(size, PAGE_SIZE);
+	/* swiotlb_max_segment_size can return 1 byte when it means one page. */
+	if (size < PAGE_SIZE)
+		size = PAGE_SIZE;
+
+	return size;
+}
+
 static inline const struct intel_device_info *
 intel_info(const struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3bf517e2430a..9312284a31e4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2255,7 +2255,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment;
+	unsigned int max_segment = i915_sg_segment_size();
 	int ret;
 	gfp_t gfp;
 
@@ -2266,10 +2266,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-	max_segment = swiotlb_max_segment();
-	if (!max_segment)
-		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 6a8fa085b74e..95b62b9c5cd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,64 +390,42 @@ struct get_pages_work {
 	struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-	struct scatterlist *sg;
-	int ret, n;
-
-	*st = kmalloc(sizeof(**st), GFP_KERNEL);
-	if (*st == NULL)
-		return -ENOMEM;
-
-	if (swiotlb_active()) {
-		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-		if (ret)
-			goto err;
-
-		for_each_sg((*st)->sgl, sg, num_pages, n)
-			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-	} else {
-		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-						0, num_pages << PAGE_SHIFT,
-						GFP_KERNEL);
-		if (ret)
-			goto err;
-	}
-
-	return 0;
-
-err:
-	kfree(*st);
-	*st = NULL;
-	return ret;
-}
-
 static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-			     struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+			       struct page **pvec, int num_pages)
 {
-	struct sg_table *pages;
+	unsigned int max_segment = i915_sg_segment_size();
+	struct sg_table *st;
 	int ret;
 
-	ret = st_set_pages(&pages, pvec, num_pages);
-	if (ret)
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (!st)
+		return ERR_PTR(-ENOMEM);
+
+alloc_table:
+	ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
+					  0, num_pages << PAGE_SHIFT,
+					  max_segment,
+					  GFP_KERNEL);
+	if (ret) {
+		kfree(st);
 		return ERR_PTR(ret);
+	}
 
-	ret = i915_gem_gtt_prepare_pages(obj, pages);
+	ret = i915_gem_gtt_prepare_pages(obj, st);
 	if (ret) {
-		sg_free_table(pages);
-		kfree(pages);
+		sg_free_table(st);
+
+		if (max_segment > PAGE_SIZE) {
+			max_segment = PAGE_SIZE;
+			goto alloc_table;
+		}
+
+		kfree(st);
 		return ERR_PTR(ret);
 	}
 
-	return pages;
+	return st;
 }
 
 static int
@@ -531,7 +509,8 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 		struct sg_table *pages = ERR_PTR(ret);
 
 		if (pinned == npages) {
-			pages = __i915_gem_userptr_set_pages(obj, pvec, npages);
+			pages = __i915_gem_userptr_alloc_pages(obj, pvec,
+							       npages);
 			if (!IS_ERR(pages)) {
 				__i915_gem_object_set_pages(obj, pages);
 				pinned = 0;
@@ -653,7 +632,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 	else if (pinned < num_pages)
 		pages = __i915_gem_userptr_get_pages_schedule(obj, &active);
 	else
-		pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages);
+		pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
 	if (IS_ERR(pages)) {
 		__i915_gem_userptr_set_active(obj, active);
 		release_pages(pvec, pinned, 0);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2017-01-11 11:58   ` [PATCH v4] " Tvrtko Ursulin
@ 2017-01-11 23:59     ` Andy Shevchenko
  2017-01-13  8:37       ` [Intel-gfx] " Tvrtko Ursulin
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Shevchenko @ 2017-01-11 23:59 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Intel-gfx, Tvrtko Ursulin, Masahiro Yamada, linux-kernel,
	Joonas Lahtinen

On Wed, Jan 11, 2017 at 1:58 PM, Tvrtko Ursulin <tursulin@ursulin.net> wrote:
> Since the scatterlist length field is an unsigned int, make
> sure that sg_alloc_table_from_pages does not overflow it while
> coallescing pages to a single entry.


>  /*
> + * Since the above length field is an unsigned int, below we define the maximum
> + * lenght in bytes that can be stored in one scatterlist entry.

length

> + */
> +#define SCATTERLIST_MAX_SEGMENT (0xfffff000)

Shouldn't be calculated from PAGE_SIZE (PAGE bits, etc)?

> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c

> @@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>
>         /* compute number of contiguous chunks */
>         chunks = 1;
> -       for (i = 1; i < n_pages; ++i)
> -               if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
> +       seg_len = PAGE_SIZE;
> +       for (i = 1; i < n_pages; ++i) {
> +               if (seg_len >= max_segment ||
> +                   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
>                         ++chunks;
> +                       seg_len = PAGE_SIZE;
> +               } else {
> +                       seg_len += PAGE_SIZE;
> +               }
> +       }

Wouldn't be following looks more readable?

seg_len = 0;
// Are compilers so stupid doing calculation per iteration in for-conditional?
// for (i = 0; i + 1 < n_pages; i++) ?
for (i = 1; i < n_pages; ++i) {
  seg_len += PAGE_SIZE;
  if (seg_len >= max_segment || page_to_pfn(pages[i]) !=
page_to_pfn(pages[i - 1]) + 1) {
    ++chunks;
    seg_len = PAGE_SIZE;
  }
}

Perhaps while() or do-while() will increase readability even more, but
I didn't check.

>                 /* look for the end of the current chunk */
> +               seg_len = PAGE_SIZE;
>                 for (j = cur_page + 1; j < n_pages; ++j)
> -                       if (page_to_pfn(pages[j]) !=
> +                       if (seg_len >= max_segment ||
> +                           page_to_pfn(pages[j]) !=
>                             page_to_pfn(pages[j - 1]) + 1)
>                                 break;
> +                       else
> +                               seg_len += PAGE_SIZE;

Something similar here (didn't you get warning from checkpath about
curly braces?).

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx] [PATCH v6] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
  2017-01-11 11:59   ` [PATCH v6] " Tvrtko Ursulin
@ 2017-01-12 18:20     ` kbuild test robot
  0 siblings, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2017-01-12 18:20 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: kbuild-all, Intel-gfx, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2550 bytes --]

Hi Tvrtko,

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20170111]
[cannot apply to v4.10-rc3]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Tvrtko-Ursulin/drm-i915-Use-__sg_alloc_table_from_pages-for-userptr-allocations/20170113-004619
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-rhel (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   In file included from drivers/gpu/drm/i915/i915_drv.c:48:0:
   drivers/gpu/drm/i915/i915_drv.h: In function 'i915_sg_segment_size':
>> drivers/gpu/drm/i915/i915_drv.h:2605:10: error: 'SCATTERLIST_MAX_SEGMENT' undeclared (first use in this function)
      return SCATTERLIST_MAX_SEGMENT;
             ^~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h:2605:10: note: each undeclared identifier is reported only once for each function it appears in
--
   In file included from drivers/gpu/drm/i915/i915_gem_userptr.c:27:0:
   drivers/gpu/drm/i915/i915_drv.h: In function 'i915_sg_segment_size':
>> drivers/gpu/drm/i915/i915_drv.h:2605:10: error: 'SCATTERLIST_MAX_SEGMENT' undeclared (first use in this function)
      return SCATTERLIST_MAX_SEGMENT;
             ^~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h:2605:10: note: each undeclared identifier is reported only once for each function it appears in
   drivers/gpu/drm/i915/i915_gem_userptr.c: In function '__i915_gem_userptr_alloc_pages':
>> drivers/gpu/drm/i915/i915_gem_userptr.c:406:8: error: implicit declaration of function '__sg_alloc_table_from_pages' [-Werror=implicit-function-declaration]
     ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/SCATTERLIST_MAX_SEGMENT +2605 drivers/gpu/drm/i915/i915_drv.h

  2599	
  2600	static inline unsigned int i915_sg_segment_size(void)
  2601	{
  2602		unsigned int size = swiotlb_max_segment();
  2603	
  2604		if (size == 0)
> 2605			return SCATTERLIST_MAX_SEGMENT;
  2606	
  2607		size = rounddown(size, PAGE_SIZE);
  2608		/* swiotlb_max_segment_size can return 1 byte when it means one page. */

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 38329 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx] [PATCH v4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2017-01-11 23:59     ` Andy Shevchenko
@ 2017-01-13  8:37       ` Tvrtko Ursulin
  2017-01-13 22:23         ` Andy Shevchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-13  8:37 UTC (permalink / raw)
  To: Andy Shevchenko, Tvrtko Ursulin
  Cc: Masahiro Yamada, Intel-gfx, linux-kernel, Joonas Lahtinen


Hi,

On 11/01/2017 23:59, Andy Shevchenko wrote:
> On Wed, Jan 11, 2017 at 1:58 PM, Tvrtko Ursulin <tursulin@ursulin.net> wrote:
>> Since the scatterlist length field is an unsigned int, make
>> sure that sg_alloc_table_from_pages does not overflow it while
>> coallescing pages to a single entry.
>
>
>>  /*
>> + * Since the above length field is an unsigned int, below we define the maximum
>> + * lenght in bytes that can be stored in one scatterlist entry.
>
> length
>
>> + */
>> +#define SCATTERLIST_MAX_SEGMENT (0xfffff000)
>
> Shouldn't be calculated from PAGE_SIZE (PAGE bits, etc)?

Yep, and at the same time I would potentially change the name to be 
consistent with the other defines in the file as Joonas suggested. 
Something like:

#define SG_MAX_SEGMENT (UINT_MAX & PAGE_MASK)

But would need a better name since SG_MAX_SEGMENT*S* already exists and 
means something else. If we can't come up with a better name then leave 
it as SCATTERLIST_MAX_SEGMENT?

>> --- a/lib/scatterlist.c
>> +++ b/lib/scatterlist.c
>
>> @@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>>
>>         /* compute number of contiguous chunks */
>>         chunks = 1;
>> -       for (i = 1; i < n_pages; ++i)
>> -               if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
>> +       seg_len = PAGE_SIZE;
>> +       for (i = 1; i < n_pages; ++i) {
>> +               if (seg_len >= max_segment ||
>> +                   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
>>                         ++chunks;
>> +                       seg_len = PAGE_SIZE;
>> +               } else {
>> +                       seg_len += PAGE_SIZE;
>> +               }
>> +       }
>
> Wouldn't be following looks more readable?
>
> seg_len = 0;
> // Are compilers so stupid doing calculation per iteration in for-conditional?
> // for (i = 0; i + 1 < n_pages; i++) ?

I didn't get what you meant here?

> for (i = 1; i < n_pages; ++i) {
>   seg_len += PAGE_SIZE;
>   if (seg_len >= max_segment || page_to_pfn(pages[i]) !=
> page_to_pfn(pages[i - 1]) + 1) {
>     ++chunks;
>     seg_len = PAGE_SIZE;
>   }
> }

Tried it in my unit tester but it doesn't work for all scenarios, guess 
there is a subtle bug somewhere. I don't find it that unreadable so 
would prefer to leave it since it works.

> Perhaps while() or do-while() will increase readability even more, but
> I didn't check.
>
>>                 /* look for the end of the current chunk */
>> +               seg_len = PAGE_SIZE;
>>                 for (j = cur_page + 1; j < n_pages; ++j)
>> -                       if (page_to_pfn(pages[j]) !=
>> +                       if (seg_len >= max_segment ||
>> +                           page_to_pfn(pages[j]) !=
>>                             page_to_pfn(pages[j - 1]) + 1)
>>                                 break;
>> +                       else
>> +                               seg_len += PAGE_SIZE;
>
> Something similar here (didn't you get warning from checkpath about
> curly braces?).

It didn't, but agreed that I should have added them. Will do.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx] [PATCH v4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2017-01-13  8:37       ` [Intel-gfx] " Tvrtko Ursulin
@ 2017-01-13 22:23         ` Andy Shevchenko
  2017-01-16 10:05           ` Tvrtko Ursulin
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Shevchenko @ 2017-01-13 22:23 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Tvrtko Ursulin, Masahiro Yamada, Intel-gfx, linux-kernel,
	Joonas Lahtinen

>>> @@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>>>
>>>         /* compute number of contiguous chunks */
>>>         chunks = 1;
>>> -       for (i = 1; i < n_pages; ++i)
>>> -               if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) +
>>> 1)
>>> +       seg_len = PAGE_SIZE;
>>> +       for (i = 1; i < n_pages; ++i) {
>>> +               if (seg_len >= max_segment ||
>>> +                   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) +
>>> 1) {
>>>                         ++chunks;
>>> +                       seg_len = PAGE_SIZE;
>>> +               } else {
>>> +                       seg_len += PAGE_SIZE;
>>> +               }
>>> +       }
>>
>>
>> Wouldn't be following looks more readable?
>>
>> seg_len = 0;
>> // Are compilers so stupid doing calculation per iteration in
>> for-conditional?
>> // for (i = 0; i + 1 < n_pages; i++) ?
>
>
> I didn't get what you meant here?

Why do we start from 1? I see here two micro (?) optimizations:
1) starting from 1 on believe that compiler dumb enough to every time
do a calculation in condition;
2) ++i instead of i++, but this is just matter of style, it's not a c++.

>> for (i = 1; i < n_pages; ++i) {
>>   seg_len += PAGE_SIZE;
>>   if (seg_len >= max_segment || page_to_pfn(pages[i]) !=
>> page_to_pfn(pages[i - 1]) + 1) {
>>     ++chunks;
>>     seg_len = PAGE_SIZE;
>>   }
>> }
>
>
> Tried it in my unit tester but it doesn't work for all scenarios, guess
> there is a subtle bug somewhere. I don't find it that unreadable so would
> prefer to leave it since it works.

Last seems has to be
seg_len = 0;

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx] [PATCH v4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2017-01-13 22:23         ` Andy Shevchenko
@ 2017-01-16 10:05           ` Tvrtko Ursulin
  2017-01-16 10:12             ` Andy Shevchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2017-01-16 10:05 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Tvrtko Ursulin, Masahiro Yamada, Intel-gfx, linux-kernel,
	Joonas Lahtinen


On 13/01/2017 22:23, Andy Shevchenko wrote:
>>>> @@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>>>>
>>>>         /* compute number of contiguous chunks */
>>>>         chunks = 1;
>>>> -       for (i = 1; i < n_pages; ++i)
>>>> -               if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) +
>>>> 1)
>>>> +       seg_len = PAGE_SIZE;
>>>> +       for (i = 1; i < n_pages; ++i) {
>>>> +               if (seg_len >= max_segment ||
>>>> +                   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) +
>>>> 1) {
>>>>                         ++chunks;
>>>> +                       seg_len = PAGE_SIZE;
>>>> +               } else {
>>>> +                       seg_len += PAGE_SIZE;
>>>> +               }
>>>> +       }
>>>
>>>
>>> Wouldn't be following looks more readable?
>>>
>>> seg_len = 0;
>>> // Are compilers so stupid doing calculation per iteration in
>>> for-conditional?
>>> // for (i = 0; i + 1 < n_pages; i++) ?
>>
>>
>> I didn't get what you meant here?
>
> Why do we start from 1? I see here two micro (?) optimizations:
> 1) starting from 1 on believe that compiler dumb enough to every time
> do a calculation in condition;

The existing code starts from 1 because the pfn condition looks up page 
i - 1. I don't feel there is a need to change that as well.

> 2) ++i instead of i++, but this is just matter of style, it's not a c++.

Note that I haven't changed the existing code in this respect. I am 
happy to change it though.

>>> for (i = 1; i < n_pages; ++i) {
>>>   seg_len += PAGE_SIZE;
>>>   if (seg_len >= max_segment || page_to_pfn(pages[i]) !=
>>> page_to_pfn(pages[i - 1]) + 1) {
>>>     ++chunks;
>>>     seg_len = PAGE_SIZE;
>>>   }
>>> }
>>
>>
>> Tried it in my unit tester but it doesn't work for all scenarios, guess
>> there is a subtle bug somewhere. I don't find it that unreadable so would
>> prefer to leave it since it works.
>
> Last seems has to be
> seg_len = 0;

Oh right, of course. Your suggestion generates a tiny bit smaller binary 
so I am happy to change that as well. I'll resend the patch hopefully today.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx] [PATCH v4] lib/scatterlist: Avoid potential scatterlist entry overflow
  2017-01-16 10:05           ` Tvrtko Ursulin
@ 2017-01-16 10:12             ` Andy Shevchenko
  0 siblings, 0 replies; 13+ messages in thread
From: Andy Shevchenko @ 2017-01-16 10:12 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Tvrtko Ursulin, Masahiro Yamada, Intel-gfx, linux-kernel,
	Joonas Lahtinen

On Mon, Jan 16, 2017 at 12:05 PM, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>>>>> -       for (i = 1; i < n_pages; ++i)
>>>>> +       for (i = 1; i < n_pages; ++i) {

>>>> // Are compilers so stupid doing calculation per iteration in
>>>> for-conditional?
>>>> // for (i = 0; i + 1 < n_pages; i++) ?

>>> I didn't get what you meant here?

>> Why do we start from 1? I see here two micro (?) optimizations:
>> 1) starting from 1 on believe that compiler dumb enough to every time
>> do a calculation in condition;

> The existing code starts from 1 because the pfn condition looks up page i -
> 1. I don't feel there is a need to change that as well.

>> 2) ++i instead of i++, but this is just matter of style, it's not a c++.

> Note that I haven't changed the existing code in this respect. I am happy to
> change it though.

Yes, this is another story. Just a side note to existing code, indeed.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-01-16 10:12 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-11  9:00 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2017-01-11  9:00 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
2017-01-11 11:58   ` [PATCH v4] " Tvrtko Ursulin
2017-01-11 23:59     ` Andy Shevchenko
2017-01-13  8:37       ` [Intel-gfx] " Tvrtko Ursulin
2017-01-13 22:23         ` Andy Shevchenko
2017-01-16 10:05           ` Tvrtko Ursulin
2017-01-16 10:12             ` Andy Shevchenko
2017-01-11  9:00 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
2017-01-11  9:00 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
2017-01-11 11:59   ` [PATCH v6] " Tvrtko Ursulin
2017-01-12 18:20     ` [Intel-gfx] " kbuild test robot
2017-01-11 11:59 ` [PATCH v5] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).