All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
@ 2018-10-16 15:04 Ville Syrjala
  2018-10-16 16:53 ` ✗ Fi.CI.SPARSE: warning for " Patchwork
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Ville Syrjala @ 2018-10-16 15:04 UTC (permalink / raw)
  To: intel-gfx

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Replace the kvmalloc_array() with i915_gem_object_get_dma_address() when
populating rotated vmas. One random access mechanism ought to be enough
for everyone?

To calculate the size of the radix tree I think we can do
something like this (assuming 64bit pointers):
 num_pages = obj_size / 4096
 tree_height = ceil(log64(num_pages))
 num_nodes = sum(64^n, n, 0, tree_height-1)
 tree_size = num_nodes * 576

If we compare that with the object size we should get a relative
overhead of around .2% to 1% for reasonable sized objects,
which framebuffers tend to be.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 31 ++++++-----------------------
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 29ca9007a704..98d9a1eb1ed2 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3637,7 +3637,7 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
 }
 
 static struct scatterlist *
-rotate_pages(const dma_addr_t *in, unsigned int offset,
+rotate_pages(struct drm_i915_gem_object *obj, unsigned int offset,
 	     unsigned int width, unsigned int height,
 	     unsigned int stride,
 	     struct sg_table *st, struct scatterlist *sg)
@@ -3646,7 +3646,7 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
 	unsigned int src_idx;
 
 	for (column = 0; column < width; column++) {
-		src_idx = stride * (height - 1) + column;
+		src_idx = stride * (height - 1) + column + offset;
 		for (row = 0; row < height; row++) {
 			st->nents++;
 			/* We don't need the pages, but need to initialize
@@ -3654,7 +3654,8 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
 			 * The only thing we need are DMA addresses.
 			 */
 			sg_set_page(sg, NULL, I915_GTT_PAGE_SIZE, 0);
-			sg_dma_address(sg) = in[offset + src_idx];
+			sg_dma_address(sg) =
+				i915_gem_object_get_dma_address(obj, src_idx);
 			sg_dma_len(sg) = I915_GTT_PAGE_SIZE;
 			sg = sg_next(sg);
 			src_idx -= stride;
@@ -3668,22 +3669,11 @@ static noinline struct sg_table *
 intel_rotate_pages(struct intel_rotation_info *rot_info,
 		   struct drm_i915_gem_object *obj)
 {
-	const unsigned long n_pages = obj->base.size / I915_GTT_PAGE_SIZE;
 	unsigned int size = intel_rotation_info_size(rot_info);
-	struct sgt_iter sgt_iter;
-	dma_addr_t dma_addr;
-	unsigned long i;
-	dma_addr_t *page_addr_list;
 	struct sg_table *st;
 	struct scatterlist *sg;
 	int ret = -ENOMEM;
-
-	/* Allocate a temporary list of source pages for random access. */
-	page_addr_list = kvmalloc_array(n_pages,
-					sizeof(dma_addr_t),
-					GFP_KERNEL);
-	if (!page_addr_list)
-		return ERR_PTR(ret);
+	int i;
 
 	/* Allocate target SG list. */
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
@@ -3694,29 +3684,20 @@ intel_rotate_pages(struct intel_rotation_info *rot_info,
 	if (ret)
 		goto err_sg_alloc;
 
-	/* Populate source page list from the object. */
-	i = 0;
-	for_each_sgt_dma(dma_addr, sgt_iter, obj->mm.pages)
-		page_addr_list[i++] = dma_addr;
-
-	GEM_BUG_ON(i != n_pages);
 	st->nents = 0;
 	sg = st->sgl;
 
 	for (i = 0 ; i < ARRAY_SIZE(rot_info->plane); i++) {
-		sg = rotate_pages(page_addr_list, rot_info->plane[i].offset,
+		sg = rotate_pages(obj, rot_info->plane[i].offset,
 				  rot_info->plane[i].width, rot_info->plane[i].height,
 				  rot_info->plane[i].stride, st, sg);
 	}
 
-	kvfree(page_addr_list);
-
 	return st;
 
 err_sg_alloc:
 	kfree(st);
 err_st_alloc:
-	kvfree(page_addr_list);
 
 	DRM_DEBUG_DRIVER("Failed to create rotated mapping for object size %zu! (%ux%u tiles, %u pages)\n",
 			 obj->base.size, rot_info->plane[0].width, rot_info->plane[0].height, size);
-- 
2.18.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* ✗ Fi.CI.SPARSE: warning for drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
  2018-10-16 15:04 [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas Ville Syrjala
@ 2018-10-16 16:53 ` Patchwork
  2018-10-16 17:13 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2018-10-16 16:53 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
URL   : https://patchwork.freedesktop.org/series/51072/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
-./include/linux/mm.h:592:13: error: undefined identifier '__builtin_mul_overflow'
-./include/linux/mm.h:592:13: warning: call with no type!

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
  2018-10-16 15:04 [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas Ville Syrjala
  2018-10-16 16:53 ` ✗ Fi.CI.SPARSE: warning for " Patchwork
@ 2018-10-16 17:13 ` Patchwork
  2018-10-16 17:54 ` [PATCH] " Tvrtko Ursulin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2018-10-16 17:13 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
URL   : https://patchwork.freedesktop.org/series/51072/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4990 -> Patchwork_10476 =

== Summary - SUCCESS ==

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/51072/revisions/1/mbox/

== Known issues ==

  Here are the changes found in Patchwork_10476 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_exec_suspend@basic-s3:
      fi-kbl-soraka:      NOTRUN -> INCOMPLETE (fdo#107556, fdo#107859, fdo#107774)

    igt@kms_pipe_crc_basic@nonblocking-crc-pipe-b:
      fi-byt-clapper:     PASS -> FAIL (fdo#107362)

    igt@pm_rpm@module-reload:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#107726)

    
    ==== Possible fixes ====

    igt@drv_selftest@live_hangcheck:
      fi-icl-u2:          INCOMPLETE (fdo#108315) -> PASS

    igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
      fi-glk-j4005:       FAIL (fdo#106765) -> PASS

    igt@kms_flip@basic-flip-vs-dpms:
      fi-hsw-4770r:       DMESG-WARN (fdo#105602) -> PASS

    igt@kms_flip@basic-plain-flip:
      fi-glk-j4005:       DMESG-WARN (fdo#106097) -> PASS

    igt@kms_frontbuffer_tracking@basic:
      fi-icl-u2:          FAIL (fdo#103167) -> PASS
      fi-byt-clapper:     FAIL (fdo#103167) -> PASS

    
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#105602 https://bugs.freedesktop.org/show_bug.cgi?id=105602
  fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097
  fdo#106765 https://bugs.freedesktop.org/show_bug.cgi?id=106765
  fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362
  fdo#107556 https://bugs.freedesktop.org/show_bug.cgi?id=107556
  fdo#107726 https://bugs.freedesktop.org/show_bug.cgi?id=107726
  fdo#107774 https://bugs.freedesktop.org/show_bug.cgi?id=107774
  fdo#107859 https://bugs.freedesktop.org/show_bug.cgi?id=107859
  fdo#108315 https://bugs.freedesktop.org/show_bug.cgi?id=108315


== Participating hosts (46 -> 44) ==

  Additional (2): fi-kbl-soraka fi-icl-u 
  Missing    (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u 


== Build changes ==

    * Linux: CI_DRM_4990 -> Patchwork_10476

  CI_DRM_4990: 0bd34d92e9a27784cb988a300619f497ca0e99ec @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4681: 959d6f95cb1344e0c0dace5b236e17755826fac1 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10476: 250009151f8d3d9243a6ba13953c1c86a3f38360 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

250009151f8d drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10476/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
  2018-10-16 15:04 [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas Ville Syrjala
  2018-10-16 16:53 ` ✗ Fi.CI.SPARSE: warning for " Patchwork
  2018-10-16 17:13 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-10-16 17:54 ` Tvrtko Ursulin
  2018-10-16 18:18   ` Ville Syrjälä
  2018-10-16 20:43 ` ✓ Fi.CI.IGT: success for " Patchwork
  2018-10-18 14:52 ` [PATCH] " Tvrtko Ursulin
  4 siblings, 1 reply; 8+ messages in thread
From: Tvrtko Ursulin @ 2018-10-16 17:54 UTC (permalink / raw)
  To: Ville Syrjala, intel-gfx


On 16/10/2018 16:04, Ville Syrjala wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Replace the kvmalloc_array() with i915_gem_object_get_dma_address() when
> populating rotated vmas. One random access mechanism ought to be enough
> for everyone?
> 
> To calculate the size of the radix tree I think we can do
> something like this (assuming 64bit pointers):
>   num_pages = obj_size / 4096
>   tree_height = ceil(log64(num_pages))
>   num_nodes = sum(64^n, n, 0, tree_height-1)
>   tree_size = num_nodes * 576

For 1920x1080x4 I get around 19k in real life - does that match your 
math? For something like two UHD screens with double buffering I guess 
that's around 4 * 2 * 2 = 16 times more, which is perhaps a worst case 
setup? So ~300KiB permanently pinned and likely not used post first pin 
to display. If that is OK with you then you have my ack for the patch. 
Or a review if you can live with some delay.

Regards,

Tvrtko

> If we compare that with the object size we should get a relative
> overhead of around .2% to 1% for reasonable sized objects,
> which framebuffers tend to be.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_gtt.c | 31 ++++++-----------------------
>   1 file changed, 6 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 29ca9007a704..98d9a1eb1ed2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -3637,7 +3637,7 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
>   }
>   
>   static struct scatterlist *
> -rotate_pages(const dma_addr_t *in, unsigned int offset,
> +rotate_pages(struct drm_i915_gem_object *obj, unsigned int offset,
>   	     unsigned int width, unsigned int height,
>   	     unsigned int stride,
>   	     struct sg_table *st, struct scatterlist *sg)
> @@ -3646,7 +3646,7 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
>   	unsigned int src_idx;
>   
>   	for (column = 0; column < width; column++) {
> -		src_idx = stride * (height - 1) + column;
> +		src_idx = stride * (height - 1) + column + offset;
>   		for (row = 0; row < height; row++) {
>   			st->nents++;
>   			/* We don't need the pages, but need to initialize
> @@ -3654,7 +3654,8 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
>   			 * The only thing we need are DMA addresses.
>   			 */
>   			sg_set_page(sg, NULL, I915_GTT_PAGE_SIZE, 0);
> -			sg_dma_address(sg) = in[offset + src_idx];
> +			sg_dma_address(sg) =
> +				i915_gem_object_get_dma_address(obj, src_idx);
>   			sg_dma_len(sg) = I915_GTT_PAGE_SIZE;
>   			sg = sg_next(sg);
>   			src_idx -= stride;
> @@ -3668,22 +3669,11 @@ static noinline struct sg_table *
>   intel_rotate_pages(struct intel_rotation_info *rot_info,
>   		   struct drm_i915_gem_object *obj)
>   {
> -	const unsigned long n_pages = obj->base.size / I915_GTT_PAGE_SIZE;
>   	unsigned int size = intel_rotation_info_size(rot_info);
> -	struct sgt_iter sgt_iter;
> -	dma_addr_t dma_addr;
> -	unsigned long i;
> -	dma_addr_t *page_addr_list;
>   	struct sg_table *st;
>   	struct scatterlist *sg;
>   	int ret = -ENOMEM;
> -
> -	/* Allocate a temporary list of source pages for random access. */
> -	page_addr_list = kvmalloc_array(n_pages,
> -					sizeof(dma_addr_t),
> -					GFP_KERNEL);
> -	if (!page_addr_list)
> -		return ERR_PTR(ret);
> +	int i;
>   
>   	/* Allocate target SG list. */
>   	st = kmalloc(sizeof(*st), GFP_KERNEL);
> @@ -3694,29 +3684,20 @@ intel_rotate_pages(struct intel_rotation_info *rot_info,
>   	if (ret)
>   		goto err_sg_alloc;
>   
> -	/* Populate source page list from the object. */
> -	i = 0;
> -	for_each_sgt_dma(dma_addr, sgt_iter, obj->mm.pages)
> -		page_addr_list[i++] = dma_addr;
> -
> -	GEM_BUG_ON(i != n_pages);
>   	st->nents = 0;
>   	sg = st->sgl;
>   
>   	for (i = 0 ; i < ARRAY_SIZE(rot_info->plane); i++) {
> -		sg = rotate_pages(page_addr_list, rot_info->plane[i].offset,
> +		sg = rotate_pages(obj, rot_info->plane[i].offset,
>   				  rot_info->plane[i].width, rot_info->plane[i].height,
>   				  rot_info->plane[i].stride, st, sg);
>   	}
>   
> -	kvfree(page_addr_list);
> -
>   	return st;
>   
>   err_sg_alloc:
>   	kfree(st);
>   err_st_alloc:
> -	kvfree(page_addr_list);
>   
>   	DRM_DEBUG_DRIVER("Failed to create rotated mapping for object size %zu! (%ux%u tiles, %u pages)\n",
>   			 obj->base.size, rot_info->plane[0].width, rot_info->plane[0].height, size);
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
  2018-10-16 17:54 ` [PATCH] " Tvrtko Ursulin
@ 2018-10-16 18:18   ` Ville Syrjälä
  0 siblings, 0 replies; 8+ messages in thread
From: Ville Syrjälä @ 2018-10-16 18:18 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Tue, Oct 16, 2018 at 06:54:59PM +0100, Tvrtko Ursulin wrote:
> 
> On 16/10/2018 16:04, Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Replace the kvmalloc_array() with i915_gem_object_get_dma_address() when
> > populating rotated vmas. One random access mechanism ought to be enough
> > for everyone?
> > 
> > To calculate the size of the radix tree I think we can do
> > something like this (assuming 64bit pointers):
> >   num_pages = obj_size / 4096
> >   tree_height = ceil(log64(num_pages))
> >   num_nodes = sum(64^n, n, 0, tree_height-1)
> >   tree_size = num_nodes * 576
> 
> For 1920x1080x4 I get around 19k in real life - does that match your 
> math?

The math assumes the second level of tree is fully populated so
gives me ~37KiB. In this case we have only 2025 pages so we should
only need 33 nodes in total instead of the full 65. 33*576 is ~19KiB.

> For something like two UHD screens with double buffering I guess 
> that's around 4 * 2 * 2 = 16 times more, which is perhaps a worst case 
> setup? So ~300KiB permanently pinned and likely not used post first pin 
> to display. If that is OK with you then you have my ack for the patch. 

I think I'm OK with it. And Chris can optimize it later ;)

If we're really short on memory the shrinker should be able to get
rid of this when it's going to swap out the pages. For any fb that's
not currently pinned that is.

> Or a review if you can live with some delay.
> 
> Regards,
> 
> Tvrtko
> 
> > If we compare that with the object size we should get a relative
> > overhead of around .2% to 1% for reasonable sized objects,
> > which framebuffers tend to be.
> > 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem_gtt.c | 31 ++++++-----------------------
> >   1 file changed, 6 insertions(+), 25 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 29ca9007a704..98d9a1eb1ed2 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -3637,7 +3637,7 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
> >   }
> >   
> >   static struct scatterlist *
> > -rotate_pages(const dma_addr_t *in, unsigned int offset,
> > +rotate_pages(struct drm_i915_gem_object *obj, unsigned int offset,
> >   	     unsigned int width, unsigned int height,
> >   	     unsigned int stride,
> >   	     struct sg_table *st, struct scatterlist *sg)
> > @@ -3646,7 +3646,7 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
> >   	unsigned int src_idx;
> >   
> >   	for (column = 0; column < width; column++) {
> > -		src_idx = stride * (height - 1) + column;
> > +		src_idx = stride * (height - 1) + column + offset;
> >   		for (row = 0; row < height; row++) {
> >   			st->nents++;
> >   			/* We don't need the pages, but need to initialize
> > @@ -3654,7 +3654,8 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
> >   			 * The only thing we need are DMA addresses.
> >   			 */
> >   			sg_set_page(sg, NULL, I915_GTT_PAGE_SIZE, 0);
> > -			sg_dma_address(sg) = in[offset + src_idx];
> > +			sg_dma_address(sg) =
> > +				i915_gem_object_get_dma_address(obj, src_idx);
> >   			sg_dma_len(sg) = I915_GTT_PAGE_SIZE;
> >   			sg = sg_next(sg);
> >   			src_idx -= stride;
> > @@ -3668,22 +3669,11 @@ static noinline struct sg_table *
> >   intel_rotate_pages(struct intel_rotation_info *rot_info,
> >   		   struct drm_i915_gem_object *obj)
> >   {
> > -	const unsigned long n_pages = obj->base.size / I915_GTT_PAGE_SIZE;
> >   	unsigned int size = intel_rotation_info_size(rot_info);
> > -	struct sgt_iter sgt_iter;
> > -	dma_addr_t dma_addr;
> > -	unsigned long i;
> > -	dma_addr_t *page_addr_list;
> >   	struct sg_table *st;
> >   	struct scatterlist *sg;
> >   	int ret = -ENOMEM;
> > -
> > -	/* Allocate a temporary list of source pages for random access. */
> > -	page_addr_list = kvmalloc_array(n_pages,
> > -					sizeof(dma_addr_t),
> > -					GFP_KERNEL);
> > -	if (!page_addr_list)
> > -		return ERR_PTR(ret);
> > +	int i;
> >   
> >   	/* Allocate target SG list. */
> >   	st = kmalloc(sizeof(*st), GFP_KERNEL);
> > @@ -3694,29 +3684,20 @@ intel_rotate_pages(struct intel_rotation_info *rot_info,
> >   	if (ret)
> >   		goto err_sg_alloc;
> >   
> > -	/* Populate source page list from the object. */
> > -	i = 0;
> > -	for_each_sgt_dma(dma_addr, sgt_iter, obj->mm.pages)
> > -		page_addr_list[i++] = dma_addr;
> > -
> > -	GEM_BUG_ON(i != n_pages);
> >   	st->nents = 0;
> >   	sg = st->sgl;
> >   
> >   	for (i = 0 ; i < ARRAY_SIZE(rot_info->plane); i++) {
> > -		sg = rotate_pages(page_addr_list, rot_info->plane[i].offset,
> > +		sg = rotate_pages(obj, rot_info->plane[i].offset,
> >   				  rot_info->plane[i].width, rot_info->plane[i].height,
> >   				  rot_info->plane[i].stride, st, sg);
> >   	}
> >   
> > -	kvfree(page_addr_list);
> > -
> >   	return st;
> >   
> >   err_sg_alloc:
> >   	kfree(st);
> >   err_st_alloc:
> > -	kvfree(page_addr_list);
> >   
> >   	DRM_DEBUG_DRIVER("Failed to create rotated mapping for object size %zu! (%ux%u tiles, %u pages)\n",
> >   			 obj->base.size, rot_info->plane[0].width, rot_info->plane[0].height, size);
> > 

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* ✓ Fi.CI.IGT: success for drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
  2018-10-16 15:04 [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas Ville Syrjala
                   ` (2 preceding siblings ...)
  2018-10-16 17:54 ` [PATCH] " Tvrtko Ursulin
@ 2018-10-16 20:43 ` Patchwork
  2018-10-18 14:52 ` [PATCH] " Tvrtko Ursulin
  4 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2018-10-16 20:43 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
URL   : https://patchwork.freedesktop.org/series/51072/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4990_full -> Patchwork_10476_full =

== Summary - SUCCESS ==

  No regressions found.

  

== Known issues ==

  Here are the changes found in Patchwork_10476_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_cpu_reloc@full:
      shard-skl:          NOTRUN -> INCOMPLETE (fdo#108073)

    igt@gem_exec_schedule@pi-ringfull-bsd:
      shard-skl:          NOTRUN -> FAIL (fdo#103158) +2

    igt@gem_ppgtt@blt-vs-render-ctx0:
      shard-skl:          NOTRUN -> TIMEOUT (fdo#108039)

    igt@kms_busy@extended-modeset-hang-newfb-render-a:
      shard-skl:          NOTRUN -> DMESG-WARN (fdo#107956) +2

    igt@kms_busy@extended-modeset-hang-newfb-with-reset-render-a:
      shard-snb:          NOTRUN -> DMESG-WARN (fdo#107956)

    igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-a:
      shard-glk:          NOTRUN -> DMESG-WARN (fdo#107956) +1

    igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-b:
      shard-hsw:          NOTRUN -> DMESG-WARN (fdo#107956) +1

    igt@kms_cursor_crc@cursor-256x256-onscreen:
      shard-skl:          NOTRUN -> FAIL (fdo#103232)

    igt@kms_fbcon_fbt@psr:
      shard-skl:          NOTRUN -> FAIL (fdo#107882)

    igt@kms_frontbuffer_tracking@fbc-stridechange:
      shard-skl:          NOTRUN -> FAIL (fdo#105683)

    igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-pwrite:
      shard-skl:          NOTRUN -> FAIL (fdo#103167)

    igt@kms_panel_fitting@legacy:
      shard-skl:          NOTRUN -> FAIL (fdo#105456)

    igt@kms_plane@pixel-format-pipe-b-planes:
      shard-skl:          NOTRUN -> DMESG-FAIL (fdo#103166, fdo#106885) +1

    igt@kms_plane@plane-position-covered-pipe-b-planes:
      shard-glk:          NOTRUN -> FAIL (fdo#103166) +1

    igt@kms_plane_alpha_blend@pipe-a-alpha-basic:
      shard-skl:          NOTRUN -> FAIL (fdo#107815, fdo#108145) +1

    igt@kms_plane_alpha_blend@pipe-b-alpha-7efc:
      shard-skl:          NOTRUN -> FAIL (fdo#108146)

    igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max:
      shard-glk:          NOTRUN -> FAIL (fdo#108145) +1

    igt@kms_plane_alpha_blend@pipe-c-alpha-opaque-fb:
      shard-skl:          NOTRUN -> FAIL (fdo#108145) +1

    igt@kms_rotation_crc@exhaust-fences:
      shard-skl:          NOTRUN -> DMESG-WARN (fdo#105748)

    igt@kms_setmode@basic:
      shard-snb:          NOTRUN -> FAIL (fdo#99912)

    igt@kms_sysfs_edid_timing:
      shard-skl:          NOTRUN -> FAIL (fdo#100047)

    igt@perf@oa-exponents:
      shard-kbl:          PASS -> INCOMPLETE (fdo#103665)

    igt@pm_rpm@modeset-non-lpsp-stress:
      shard-skl:          SKIP -> INCOMPLETE (fdo#107807)

    
    ==== Possible fixes ====

    igt@gem_softpin@noreloc-s3:
      shard-skl:          INCOMPLETE (fdo#107773, fdo#104108) -> PASS

    igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-c:
      shard-skl:          DMESG-WARN (fdo#107956) -> PASS

    igt@pm_rpm@dpms-mode-unset-non-lpsp:
      shard-skl:          INCOMPLETE (fdo#107807) -> SKIP

    igt@pm_rpm@system-suspend-execbuf:
      shard-skl:          INCOMPLETE (fdo#107773, fdo#104108, fdo#107807) -> PASS

    
  fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047
  fdo#103158 https://bugs.freedesktop.org/show_bug.cgi?id=103158
  fdo#103166 https://bugs.freedesktop.org/show_bug.cgi?id=103166
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#103232 https://bugs.freedesktop.org/show_bug.cgi?id=103232
  fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
  fdo#104108 https://bugs.freedesktop.org/show_bug.cgi?id=104108
  fdo#105456 https://bugs.freedesktop.org/show_bug.cgi?id=105456
  fdo#105683 https://bugs.freedesktop.org/show_bug.cgi?id=105683
  fdo#105748 https://bugs.freedesktop.org/show_bug.cgi?id=105748
  fdo#106885 https://bugs.freedesktop.org/show_bug.cgi?id=106885
  fdo#107773 https://bugs.freedesktop.org/show_bug.cgi?id=107773
  fdo#107807 https://bugs.freedesktop.org/show_bug.cgi?id=107807
  fdo#107815 https://bugs.freedesktop.org/show_bug.cgi?id=107815
  fdo#107882 https://bugs.freedesktop.org/show_bug.cgi?id=107882
  fdo#107956 https://bugs.freedesktop.org/show_bug.cgi?id=107956
  fdo#108039 https://bugs.freedesktop.org/show_bug.cgi?id=108039
  fdo#108073 https://bugs.freedesktop.org/show_bug.cgi?id=108073
  fdo#108145 https://bugs.freedesktop.org/show_bug.cgi?id=108145
  fdo#108146 https://bugs.freedesktop.org/show_bug.cgi?id=108146
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912


== Participating hosts (6 -> 6) ==

  No changes in participating hosts


== Build changes ==

    * Linux: CI_DRM_4990 -> Patchwork_10476

  CI_DRM_4990: 0bd34d92e9a27784cb988a300619f497ca0e99ec @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4681: 959d6f95cb1344e0c0dace5b236e17755826fac1 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10476: 250009151f8d3d9243a6ba13953c1c86a3f38360 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10476/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
  2018-10-16 15:04 [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas Ville Syrjala
                   ` (3 preceding siblings ...)
  2018-10-16 20:43 ` ✓ Fi.CI.IGT: success for " Patchwork
@ 2018-10-18 14:52 ` Tvrtko Ursulin
  2018-10-18 17:02   ` Ville Syrjälä
  4 siblings, 1 reply; 8+ messages in thread
From: Tvrtko Ursulin @ 2018-10-18 14:52 UTC (permalink / raw)
  To: Ville Syrjala, intel-gfx


On 16/10/2018 16:04, Ville Syrjala wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Replace the kvmalloc_array() with i915_gem_object_get_dma_address() when
> populating rotated vmas. One random access mechanism ought to be enough
> for everyone?
> 
> To calculate the size of the radix tree I think we can do
> something like this (assuming 64bit pointers):
>   num_pages = obj_size / 4096
>   tree_height = ceil(log64(num_pages))
>   num_nodes = sum(64^n, n, 0, tree_height-1)
>   tree_size = num_nodes * 576
> 
> If we compare that with the object size we should get a relative
> overhead of around .2% to 1% for reasonable sized objects,
> which framebuffers tend to be.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_gtt.c | 31 ++++++-----------------------
>   1 file changed, 6 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 29ca9007a704..98d9a1eb1ed2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -3637,7 +3637,7 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
>   }
>   
>   static struct scatterlist *
> -rotate_pages(const dma_addr_t *in, unsigned int offset,
> +rotate_pages(struct drm_i915_gem_object *obj, unsigned int offset,
>   	     unsigned int width, unsigned int height,
>   	     unsigned int stride,
>   	     struct sg_table *st, struct scatterlist *sg)
> @@ -3646,7 +3646,7 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
>   	unsigned int src_idx;
>   
>   	for (column = 0; column < width; column++) {
> -		src_idx = stride * (height - 1) + column;
> +		src_idx = stride * (height - 1) + column + offset;
>   		for (row = 0; row < height; row++) {
>   			st->nents++;
>   			/* We don't need the pages, but need to initialize
> @@ -3654,7 +3654,8 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
>   			 * The only thing we need are DMA addresses.
>   			 */
>   			sg_set_page(sg, NULL, I915_GTT_PAGE_SIZE, 0);
> -			sg_dma_address(sg) = in[offset + src_idx];
> +			sg_dma_address(sg) =
> +				i915_gem_object_get_dma_address(obj, src_idx);
>   			sg_dma_len(sg) = I915_GTT_PAGE_SIZE;
>   			sg = sg_next(sg);
>   			src_idx -= stride;
> @@ -3668,22 +3669,11 @@ static noinline struct sg_table *
>   intel_rotate_pages(struct intel_rotation_info *rot_info,
>   		   struct drm_i915_gem_object *obj)
>   {
> -	const unsigned long n_pages = obj->base.size / I915_GTT_PAGE_SIZE;
>   	unsigned int size = intel_rotation_info_size(rot_info);
> -	struct sgt_iter sgt_iter;
> -	dma_addr_t dma_addr;
> -	unsigned long i;
> -	dma_addr_t *page_addr_list;
>   	struct sg_table *st;
>   	struct scatterlist *sg;
>   	int ret = -ENOMEM;
> -
> -	/* Allocate a temporary list of source pages for random access. */
> -	page_addr_list = kvmalloc_array(n_pages,
> -					sizeof(dma_addr_t),
> -					GFP_KERNEL);
> -	if (!page_addr_list)
> -		return ERR_PTR(ret);
> +	int i;
>   
>   	/* Allocate target SG list. */
>   	st = kmalloc(sizeof(*st), GFP_KERNEL);
> @@ -3694,29 +3684,20 @@ intel_rotate_pages(struct intel_rotation_info *rot_info,
>   	if (ret)
>   		goto err_sg_alloc;
>   
> -	/* Populate source page list from the object. */
> -	i = 0;
> -	for_each_sgt_dma(dma_addr, sgt_iter, obj->mm.pages)
> -		page_addr_list[i++] = dma_addr;
> -
> -	GEM_BUG_ON(i != n_pages);
>   	st->nents = 0;
>   	sg = st->sgl;
>   
>   	for (i = 0 ; i < ARRAY_SIZE(rot_info->plane); i++) {
> -		sg = rotate_pages(page_addr_list, rot_info->plane[i].offset,
> +		sg = rotate_pages(obj, rot_info->plane[i].offset,
>   				  rot_info->plane[i].width, rot_info->plane[i].height,
>   				  rot_info->plane[i].stride, st, sg);
>   	}
>   
> -	kvfree(page_addr_list);
> -
>   	return st;
>   
>   err_sg_alloc:
>   	kfree(st);
>   err_st_alloc:
> -	kvfree(page_addr_list);
>   
>   	DRM_DEBUG_DRIVER("Failed to create rotated mapping for object size %zu! (%ux%u tiles, %u pages)\n",
>   			 obj->base.size, rot_info->plane[0].width, rot_info->plane[0].height, size);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

And apologies for the delay. Previously I was certain the cache size 
would typically be much bigger. Perhaps I measured incorrectly back 
then, or something...

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas
  2018-10-18 14:52 ` [PATCH] " Tvrtko Ursulin
@ 2018-10-18 17:02   ` Ville Syrjälä
  0 siblings, 0 replies; 8+ messages in thread
From: Ville Syrjälä @ 2018-10-18 17:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Thu, Oct 18, 2018 at 03:52:27PM +0100, Tvrtko Ursulin wrote:
> 
> On 16/10/2018 16:04, Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Replace the kvmalloc_array() with i915_gem_object_get_dma_address() when
> > populating rotated vmas. One random access mechanism ought to be enough
> > for everyone?
> > 
> > To calculate the size of the radix tree I think we can do
> > something like this (assuming 64bit pointers):
> >   num_pages = obj_size / 4096
> >   tree_height = ceil(log64(num_pages))
> >   num_nodes = sum(64^n, n, 0, tree_height-1)
> >   tree_size = num_nodes * 576
> > 
> > If we compare that with the object size we should get a relative
> > overhead of around .2% to 1% for reasonable sized objects,
> > which framebuffers tend to be.
> > 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem_gtt.c | 31 ++++++-----------------------
> >   1 file changed, 6 insertions(+), 25 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 29ca9007a704..98d9a1eb1ed2 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -3637,7 +3637,7 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
> >   }
> >   
> >   static struct scatterlist *
> > -rotate_pages(const dma_addr_t *in, unsigned int offset,
> > +rotate_pages(struct drm_i915_gem_object *obj, unsigned int offset,
> >   	     unsigned int width, unsigned int height,
> >   	     unsigned int stride,
> >   	     struct sg_table *st, struct scatterlist *sg)
> > @@ -3646,7 +3646,7 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
> >   	unsigned int src_idx;
> >   
> >   	for (column = 0; column < width; column++) {
> > -		src_idx = stride * (height - 1) + column;
> > +		src_idx = stride * (height - 1) + column + offset;
> >   		for (row = 0; row < height; row++) {
> >   			st->nents++;
> >   			/* We don't need the pages, but need to initialize
> > @@ -3654,7 +3654,8 @@ rotate_pages(const dma_addr_t *in, unsigned int offset,
> >   			 * The only thing we need are DMA addresses.
> >   			 */
> >   			sg_set_page(sg, NULL, I915_GTT_PAGE_SIZE, 0);
> > -			sg_dma_address(sg) = in[offset + src_idx];
> > +			sg_dma_address(sg) =
> > +				i915_gem_object_get_dma_address(obj, src_idx);
> >   			sg_dma_len(sg) = I915_GTT_PAGE_SIZE;
> >   			sg = sg_next(sg);
> >   			src_idx -= stride;
> > @@ -3668,22 +3669,11 @@ static noinline struct sg_table *
> >   intel_rotate_pages(struct intel_rotation_info *rot_info,
> >   		   struct drm_i915_gem_object *obj)
> >   {
> > -	const unsigned long n_pages = obj->base.size / I915_GTT_PAGE_SIZE;
> >   	unsigned int size = intel_rotation_info_size(rot_info);
> > -	struct sgt_iter sgt_iter;
> > -	dma_addr_t dma_addr;
> > -	unsigned long i;
> > -	dma_addr_t *page_addr_list;
> >   	struct sg_table *st;
> >   	struct scatterlist *sg;
> >   	int ret = -ENOMEM;
> > -
> > -	/* Allocate a temporary list of source pages for random access. */
> > -	page_addr_list = kvmalloc_array(n_pages,
> > -					sizeof(dma_addr_t),
> > -					GFP_KERNEL);
> > -	if (!page_addr_list)
> > -		return ERR_PTR(ret);
> > +	int i;
> >   
> >   	/* Allocate target SG list. */
> >   	st = kmalloc(sizeof(*st), GFP_KERNEL);
> > @@ -3694,29 +3684,20 @@ intel_rotate_pages(struct intel_rotation_info *rot_info,
> >   	if (ret)
> >   		goto err_sg_alloc;
> >   
> > -	/* Populate source page list from the object. */
> > -	i = 0;
> > -	for_each_sgt_dma(dma_addr, sgt_iter, obj->mm.pages)
> > -		page_addr_list[i++] = dma_addr;
> > -
> > -	GEM_BUG_ON(i != n_pages);
> >   	st->nents = 0;
> >   	sg = st->sgl;
> >   
> >   	for (i = 0 ; i < ARRAY_SIZE(rot_info->plane); i++) {
> > -		sg = rotate_pages(page_addr_list, rot_info->plane[i].offset,
> > +		sg = rotate_pages(obj, rot_info->plane[i].offset,
> >   				  rot_info->plane[i].width, rot_info->plane[i].height,
> >   				  rot_info->plane[i].stride, st, sg);
> >   	}
> >   
> > -	kvfree(page_addr_list);
> > -
> >   	return st;
> >   
> >   err_sg_alloc:
> >   	kfree(st);
> >   err_st_alloc:
> > -	kvfree(page_addr_list);
> >   
> >   	DRM_DEBUG_DRIVER("Failed to create rotated mapping for object size %zu! (%ux%u tiles, %u pages)\n",
> >   			 obj->base.size, rot_info->plane[0].width, rot_info->plane[0].height, size);
> > 
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> And apologies for the delay. Previously I was certain the cache size 
> would typically be much bigger. Perhaps I measured incorrectly back 
> then, or something...

No worries. Patch pushed to dinq. Thanks for the review.

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-10-18 17:03 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-16 15:04 [PATCH] drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas Ville Syrjala
2018-10-16 16:53 ` ✗ Fi.CI.SPARSE: warning for " Patchwork
2018-10-16 17:13 ` ✓ Fi.CI.BAT: success " Patchwork
2018-10-16 17:54 ` [PATCH] " Tvrtko Ursulin
2018-10-16 18:18   ` Ville Syrjälä
2018-10-16 20:43 ` ✓ Fi.CI.IGT: success for " Patchwork
2018-10-18 14:52 ` [PATCH] " Tvrtko Ursulin
2018-10-18 17:02   ` Ville Syrjälä

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.