All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Wa32bit & Enable 48bit PPGTT
@ 2015-09-30 14:36 Michel Thierry
  2015-09-30 14:36 ` [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
  2015-09-30 14:36 ` [PATCH 2/2] drm/i915/gen8: Flip the 48b switch Michel Thierry
  0 siblings, 2 replies; 13+ messages in thread
From: Michel Thierry @ 2015-09-30 14:36 UTC (permalink / raw)
  To: intel-gfx

I am resending the 2 remaining patches to enable 48-bit PPGTT, now that the
userland usage has been defined and acknowledged.
These patches are exactly the same that were sent with the rest of the 48-bit
PPGTT implementation that are already merged.

There are 2 hardware workarounds needed to allow correct operation with
48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can
be allocated outside the first 4 PDPs; if not, the end range is forced to 4GB.
Also, more objects now use the DRM_MM_CREATE_TOP flag. To maintain
compatibility, in libdrm I added a new bo_use_48b_address_range function
that will flag these objects, while the existing drm_intel_bo_emit_reloc
clears it. 

Userland patch:
http://lists.freedesktop.org/archives/intel-gfx/2015-September/075086.html
acknowledged:
http://lists.freedesktop.org/archives/intel-gfx/2015-September/075836.html

Michel Thierry (2):
  drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  drm/i915/gen8: Flip the 48b switch

 drivers/gpu/drm/i915/i915_drv.h            |  2 ++
 drivers/gpu/drm/i915/i915_gem.c            | 25 +++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  7 ++++++-
 drivers/gpu/drm/i915/i915_params.c         |  2 +-
 include/uapi/drm/i915_drm.h                |  3 ++-
 6 files changed, 43 insertions(+), 9 deletions(-)

-- 
2.6.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-09-30 14:36 [PATCH 0/2] Wa32bit & Enable 48bit PPGTT Michel Thierry
@ 2015-09-30 14:36 ` Michel Thierry
  2015-09-30 14:42   ` Chris Wilson
                     ` (2 more replies)
  2015-09-30 14:36 ` [PATCH 2/2] drm/i915/gen8: Flip the 48b switch Michel Thierry
  1 sibling, 3 replies; 13+ messages in thread
From: Michel Thierry @ 2015-09-30 14:36 UTC (permalink / raw)
  To: intel-gfx

There are some allocations that must be only referenced by 32-bit
offsets. To limit the chances of having the first 4GB already full,
objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
DRM_MM_CREATE_TOP flags

In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
General State Heap (GSH) or Instruction State Heap (ISH) must be in a
32-bit range, because the General State Offset and Instruction State
Offset are limited to 32-bits.

Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
they can be allocated above the 32-bit address range. To limit the
chances of having the first 4GB already full, objects will use
DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.

The libdrm user of the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag is here:
http://lists.freedesktop.org/archives/intel-gfx/2015-September/075836.html

v2: Changed flag logic from neeeds_32b, to supports_48b.
v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
to use last PIN_ defined instead of hard-coded value; use correct limit
check in eb_vma_misplaced. (Chris)
v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
v6: Apply pin-high for ggtt too (Chris)
v7: Handle simultaneous pin-high and pin-mappable end correctly (Akash)
    Fix check for entries currently using +4GB addresses, use min_t and
    other polish in object_bind_to_vm (Chris)
v8: Commit message updated to point to libdrm patch.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 ++
 drivers/gpu/drm/i915/i915_gem.c            | 25 +++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
 include/uapi/drm/i915_drm.h                |  3 ++-
 4 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 04d710f..ecc56fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2826,6 +2826,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
 #define PIN_OFFSET_BIAS	(1<<3)
 #define PIN_USER	(1<<4)
 #define PIN_UPDATE	(1<<5)
+#define PIN_ZONE_4G	(1<<6)
+#define PIN_HIGH	(1<<7)
 #define PIN_OFFSET_MASK (~4095)
 int __must_check
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bf5ef7a..f0cfbb9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3354,11 +3354,9 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 fence_alignment, unfenced_alignment;
+	u32 search_flag, alloc_flag;
+	u64 start, end;
 	u64 size, fence_size;
-	u64 start =
-		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
-	u64 end =
-		flags & PIN_MAPPABLE ? dev_priv->gtt.mappable_end : vm->total;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3398,6 +3396,13 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 		size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
 	}
 
+	start = flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
+	end = vm->total;
+	if (flags & PIN_MAPPABLE)
+		end = min_t(u64, end, dev_priv->gtt.mappable_end);
+	if (flags & PIN_ZONE_4G)
+		end = min_t(u64, end, (1ULL << 32));
+
 	if (alignment == 0)
 		alignment = flags & PIN_MAPPABLE ? fence_alignment :
 						unfenced_alignment;
@@ -3433,13 +3438,21 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	if (IS_ERR(vma))
 		goto err_unpin;
 
+	if (flags & PIN_HIGH) {
+		search_flag = DRM_MM_SEARCH_BELOW;
+		alloc_flag = DRM_MM_CREATE_TOP;
+	} else {
+		search_flag = DRM_MM_SEARCH_DEFAULT;
+		alloc_flag = DRM_MM_CREATE_DEFAULT;
+	}
+
 search_free:
 	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
 						  size, alignment,
 						  obj->cache_level,
 						  start, end,
-						  DRM_MM_SEARCH_DEFAULT,
-						  DRM_MM_CREATE_DEFAULT);
+						  search_flag,
+						  alloc_flag);
 	if (ret) {
 		ret = i915_gem_evict_something(dev, vm, size, alignment,
 					       obj->cache_level,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 67ef118..6ca39c1 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -589,11 +589,20 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 	if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
 		flags |= PIN_GLOBAL;
 
+	/* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
+	 * limit address to the first 4GBs for unflagged objects.
+	 */
+	flags |= PIN_ZONE_4G;
+	if (entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
+		flags &= ~PIN_ZONE_4G;
+
 	if (!drm_mm_node_allocated(&vma->node)) {
 		if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
 			flags |= PIN_GLOBAL | PIN_MAPPABLE;
 		if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS)
 			flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
+		if ((flags & PIN_MAPPABLE) == 0)
+			flags |= PIN_HIGH;
 	}
 
 	ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags);
@@ -671,6 +680,10 @@ eb_vma_misplaced(struct i915_vma *vma)
 	if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
 		return !only_mappable_for_reloc(entry->flags);
 
+	if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
+	    (vma->node.start + vma->node.size - 1) >> 32)
+		return true;
+
 	return false;
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index fd5aa47..484a9fb 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -690,7 +690,8 @@ struct drm_i915_gem_exec_object2 {
 #define EXEC_OBJECT_NEEDS_FENCE (1<<0)
 #define EXEC_OBJECT_NEEDS_GTT	(1<<1)
 #define EXEC_OBJECT_WRITE	(1<<2)
-#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1)
+#define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
+#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_SUPPORTS_48B_ADDRESS<<1)
 	__u64 flags;
 
 	__u64 rsvd1;
-- 
2.6.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] drm/i915/gen8: Flip the 48b switch
  2015-09-30 14:36 [PATCH 0/2] Wa32bit & Enable 48bit PPGTT Michel Thierry
  2015-09-30 14:36 ` [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
@ 2015-09-30 14:36 ` Michel Thierry
  2015-10-01 13:16   ` Daniel Vetter
  1 sibling, 1 reply; 13+ messages in thread
From: Michel Thierry @ 2015-09-30 14:36 UTC (permalink / raw)
  To: intel-gfx

Use 48b addresses if hw supports it (i915.enable_ppgtt=3).
Update the sanitize_enable_ppgtt for 48 bit PPGTT mode.

Note, aliasing PPGTT remains 32b only.

v2: s/full_64b/full_48b/. (Akash)
v3: Add sanitize_enable_ppgtt changes until here. (Akash)
v4: Update param description (Chris)

Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 7 ++++++-
 drivers/gpu/drm/i915/i915_params.c  | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 47344d0..2d964a7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -104,9 +104,11 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 {
 	bool has_aliasing_ppgtt;
 	bool has_full_ppgtt;
+	bool has_full_48bit_ppgtt;
 
 	has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
 	has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
+	has_full_48bit_ppgtt = IS_BROADWELL(dev) || INTEL_INFO(dev)->gen >= 9;
 
 	if (intel_vgpu_active(dev))
 		has_full_ppgtt = false; /* emulation is too hard */
@@ -125,6 +127,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 	if (enable_ppgtt == 2 && has_full_ppgtt)
 		return 2;
 
+	if (enable_ppgtt == 3 && has_full_48bit_ppgtt)
+		return 3;
+
 #ifdef CONFIG_INTEL_IOMMU
 	/* Disable ppgtt on SNB if VT-d is on. */
 	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
@@ -141,7 +146,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 	}
 
 	if (INTEL_INFO(dev)->gen >= 8 && i915.enable_execlists)
-		return 2;
+		return has_full_48bit_ppgtt ? 3 : 2;
 	else
 		return has_aliasing_ppgtt ? 1 : 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index ca9b8f6..368df67 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -111,7 +111,7 @@ MODULE_PARM_DESC(enable_hangcheck,
 module_param_named_unsafe(enable_ppgtt, i915.enable_ppgtt, int, 0400);
 MODULE_PARM_DESC(enable_ppgtt,
 	"Override PPGTT usage. "
-	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
+	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full, 3=full with extended address space)");
 
 module_param_named_unsafe(enable_execlists, i915.enable_execlists, int, 0400);
 MODULE_PARM_DESC(enable_execlists,
-- 
2.6.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-09-30 14:36 ` [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
@ 2015-09-30 14:42   ` Chris Wilson
  2015-09-30 15:45   ` Tvrtko Ursulin
  2015-10-01 12:33   ` [PATCH v2 (not really v2)] " Michel Thierry
  2 siblings, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2015-09-30 14:42 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Wed, Sep 30, 2015 at 03:36:18PM +0100, Michel Thierry wrote:

I made one more change based on profiling:

> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 67ef118..6ca39c1 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -589,11 +589,20 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
>  	if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
>  		flags |= PIN_GLOBAL;
>  
> +	/* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
> +	 * limit address to the first 4GBs for unflagged objects.
> +	 */
> +	flags |= PIN_ZONE_4G;
> +	if (entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
> +		flags &= ~PIN_ZONE_4G;
> +
>  	if (!drm_mm_node_allocated(&vma->node)) {
		/* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
		 * limit address to the first 4GBs for unflagged objects.
		 */
		if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0)
			flags |= PIN_ZONE_4G;

>  		if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
>  			flags |= PIN_GLOBAL | PIN_MAPPABLE;
>  		if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS)
>  			flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
> +		if ((flags & PIN_MAPPABLE) == 0)
> +			flags |= PIN_HIGH;
>  	}

It saves me a patch if it we do it now.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-09-30 14:36 ` [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
  2015-09-30 14:42   ` Chris Wilson
@ 2015-09-30 15:45   ` Tvrtko Ursulin
  2015-09-30 15:53     ` Chris Wilson
  2015-10-01 12:33   ` [PATCH v2 (not really v2)] " Michel Thierry
  2 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2015-09-30 15:45 UTC (permalink / raw)
  To: Michel Thierry, intel-gfx


Hi,

On 30/09/15 15:36, Michel Thierry wrote:
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 67ef118..6ca39c1 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -589,11 +589,20 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
>   	if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
>   		flags |= PIN_GLOBAL;
>
> +	/* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
> +	 * limit address to the first 4GBs for unflagged objects.
> +	 */
> +	flags |= PIN_ZONE_4G;
> +	if (entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
> +		flags &= ~PIN_ZONE_4G;

I spotted this patch purely accidentally since it was probably the 
reason pad_to_size IGT started failing in the Android tree - given how 
there is mention of changing the allocation order. Anyway beside the point..

Point is when I spotted it by accident, I also spotted this unusual 
handling of flags - set then conditionally clear. Why not conditionally 
set for one fewer line of code?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-09-30 15:45   ` Tvrtko Ursulin
@ 2015-09-30 15:53     ` Chris Wilson
  0 siblings, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2015-09-30 15:53 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Wed, Sep 30, 2015 at 04:45:18PM +0100, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> On 30/09/15 15:36, Michel Thierry wrote:
> >diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >index 67ef118..6ca39c1 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >@@ -589,11 +589,20 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> >  	if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
> >  		flags |= PIN_GLOBAL;
> >
> >+	/* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
> >+	 * limit address to the first 4GBs for unflagged objects.
> >+	 */
> >+	flags |= PIN_ZONE_4G;
> >+	if (entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
> >+		flags &= ~PIN_ZONE_4G;
> 
> I spotted this patch purely accidentally since it was probably the
> reason pad_to_size IGT started failing in the Android tree - given
> how there is mention of changing the allocation order. Anyway beside
> the point..
> 
> Point is when I spotted it by accident, I also spotted this unusual
> handling of flags - set then conditionally clear. Why not
> conditionally set for one fewer line of code?

Because for the next few years entry->flags & 48B will be in the
minority. The other reason is that is emphasizes that we limit
everything by default and only allow the special objects to use highmem.

The idiom of:
  x = default/expected;
  if (unlikely)
  	x = something else;
is fairly commonplace in the kernel, at least before gcc had likely().
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 (not really v2)] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-09-30 14:36 ` [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
  2015-09-30 14:42   ` Chris Wilson
  2015-09-30 15:45   ` Tvrtko Ursulin
@ 2015-10-01 12:33   ` Michel Thierry
  2015-10-01 13:15     ` Daniel Vetter
  2 siblings, 1 reply; 13+ messages in thread
From: Michel Thierry @ 2015-10-01 12:33 UTC (permalink / raw)
  To: intel-gfx

There are some allocations that must be only referenced by 32-bit
offsets. To limit the chances of having the first 4GB already full,
objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
DRM_MM_CREATE_TOP flags

In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
General State Heap (GSH) or Instruction State Heap (ISH) must be in a
32-bit range, because the General State Offset and Instruction State
Offset are limited to 32-bits.

Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
they can be allocated above the 32-bit address range. To limit the
chances of having the first 4GB already full, objects will use
DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.

The libdrm user of the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag is here:
http://lists.freedesktop.org/archives/intel-gfx/2015-September/075836.html

v2: Changed flag logic from neeeds_32b, to supports_48b.
v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
to use last PIN_ defined instead of hard-coded value; use correct limit
check in eb_vma_misplaced. (Chris)
v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
v6: Apply pin-high for ggtt too (Chris)
v7: Handle simultaneous pin-high and pin-mappable end correctly (Akash)
    Fix check for entries currently using +4GB addresses, use min_t and
    other polish in object_bind_to_vm (Chris)
v8: Commit message updated to point to libdrm patch.
v9: vmas are allocated in the correct ozone, so only check flag when the
    vma has not been allocated. (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 ++
 drivers/gpu/drm/i915/i915_gem.c            | 25 +++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++++++++
 include/uapi/drm/i915_drm.h                |  3 ++-
 4 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 04d710f..ecc56fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2826,6 +2826,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
 #define PIN_OFFSET_BIAS	(1<<3)
 #define PIN_USER	(1<<4)
 #define PIN_UPDATE	(1<<5)
+#define PIN_ZONE_4G	(1<<6)
+#define PIN_HIGH	(1<<7)
 #define PIN_OFFSET_MASK (~4095)
 int __must_check
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bf5ef7a..f0cfbb9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3354,11 +3354,9 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 fence_alignment, unfenced_alignment;
+	u32 search_flag, alloc_flag;
+	u64 start, end;
 	u64 size, fence_size;
-	u64 start =
-		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
-	u64 end =
-		flags & PIN_MAPPABLE ? dev_priv->gtt.mappable_end : vm->total;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3398,6 +3396,13 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 		size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
 	}
 
+	start = flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
+	end = vm->total;
+	if (flags & PIN_MAPPABLE)
+		end = min_t(u64, end, dev_priv->gtt.mappable_end);
+	if (flags & PIN_ZONE_4G)
+		end = min_t(u64, end, (1ULL << 32));
+
 	if (alignment == 0)
 		alignment = flags & PIN_MAPPABLE ? fence_alignment :
 						unfenced_alignment;
@@ -3433,13 +3438,21 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	if (IS_ERR(vma))
 		goto err_unpin;
 
+	if (flags & PIN_HIGH) {
+		search_flag = DRM_MM_SEARCH_BELOW;
+		alloc_flag = DRM_MM_CREATE_TOP;
+	} else {
+		search_flag = DRM_MM_SEARCH_DEFAULT;
+		alloc_flag = DRM_MM_CREATE_DEFAULT;
+	}
+
 search_free:
 	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
 						  size, alignment,
 						  obj->cache_level,
 						  start, end,
-						  DRM_MM_SEARCH_DEFAULT,
-						  DRM_MM_CREATE_DEFAULT);
+						  search_flag,
+						  alloc_flag);
 	if (ret) {
 		ret = i915_gem_evict_something(dev, vm, size, alignment,
 					       obj->cache_level,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 67ef118..edc17be 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -590,10 +590,17 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		flags |= PIN_GLOBAL;
 
 	if (!drm_mm_node_allocated(&vma->node)) {
+		/* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
+		 * limit address to the first 4GBs for unflagged objects.
+		 */
+		if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0)
+			flags |= PIN_ZONE_4G;
 		if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
 			flags |= PIN_GLOBAL | PIN_MAPPABLE;
 		if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS)
 			flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
+		if ((flags & PIN_MAPPABLE) == 0)
+			flags |= PIN_HIGH;
 	}
 
 	ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags);
@@ -671,6 +678,10 @@ eb_vma_misplaced(struct i915_vma *vma)
 	if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
 		return !only_mappable_for_reloc(entry->flags);
 
+	if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
+	    (vma->node.start + vma->node.size - 1) >> 32)
+		return true;
+
 	return false;
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index fd5aa47..484a9fb 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -690,7 +690,8 @@ struct drm_i915_gem_exec_object2 {
 #define EXEC_OBJECT_NEEDS_FENCE (1<<0)
 #define EXEC_OBJECT_NEEDS_GTT	(1<<1)
 #define EXEC_OBJECT_WRITE	(1<<2)
-#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1)
+#define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
+#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_SUPPORTS_48B_ADDRESS<<1)
 	__u64 flags;
 
 	__u64 rsvd1;
-- 
2.6.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 (not really v2)] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-10-01 12:33   ` [PATCH v2 (not really v2)] " Michel Thierry
@ 2015-10-01 13:15     ` Daniel Vetter
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Vetter @ 2015-10-01 13:15 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Thu, Oct 01, 2015 at 01:33:57PM +0100, Michel Thierry wrote:
> There are some allocations that must be only referenced by 32-bit
> offsets. To limit the chances of having the first 4GB already full,
> objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
> DRM_MM_CREATE_TOP flags
> 
> In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
> General State Heap (GSH) or Instruction State Heap (ISH) must be in a
> 32-bit range, because the General State Offset and Instruction State
> Offset are limited to 32-bits.
> 
> Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
> they can be allocated above the 32-bit address range. To limit the
> chances of having the first 4GB already full, objects will use
> DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.
> 
> The libdrm user of the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag is here:
> http://lists.freedesktop.org/archives/intel-gfx/2015-September/075836.html
> 
> v2: Changed flag logic from neeeds_32b, to supports_48b.
> v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
> v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
> to use last PIN_ defined instead of hard-coded value; use correct limit
> check in eb_vma_misplaced. (Chris)
> v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
> v6: Apply pin-high for ggtt too (Chris)
> v7: Handle simultaneous pin-high and pin-mappable end correctly (Akash)
>     Fix check for entries currently using +4GB addresses, use min_t and
>     other polish in object_bind_to_vm (Chris)
> v8: Commit message updated to point to libdrm patch.
> v9: vmas are allocated in the correct ozone, so only check flag when the
>     vma has not been allocated. (Chris)
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>

Queued for -next, thanks for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  2 ++
>  drivers/gpu/drm/i915/i915_gem.c            | 25 +++++++++++++++++++------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++++++++
>  include/uapi/drm/i915_drm.h                |  3 ++-
>  4 files changed, 34 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 04d710f..ecc56fa 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2826,6 +2826,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
>  #define PIN_OFFSET_BIAS	(1<<3)
>  #define PIN_USER	(1<<4)
>  #define PIN_UPDATE	(1<<5)
> +#define PIN_ZONE_4G	(1<<6)
> +#define PIN_HIGH	(1<<7)
>  #define PIN_OFFSET_MASK (~4095)
>  int __must_check
>  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bf5ef7a..f0cfbb9 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3354,11 +3354,9 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	u32 fence_alignment, unfenced_alignment;
> +	u32 search_flag, alloc_flag;
> +	u64 start, end;
>  	u64 size, fence_size;
> -	u64 start =
> -		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
> -	u64 end =
> -		flags & PIN_MAPPABLE ? dev_priv->gtt.mappable_end : vm->total;
>  	struct i915_vma *vma;
>  	int ret;
>  
> @@ -3398,6 +3396,13 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  		size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
>  	}
>  
> +	start = flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
> +	end = vm->total;
> +	if (flags & PIN_MAPPABLE)
> +		end = min_t(u64, end, dev_priv->gtt.mappable_end);
> +	if (flags & PIN_ZONE_4G)
> +		end = min_t(u64, end, (1ULL << 32));
> +
>  	if (alignment == 0)
>  		alignment = flags & PIN_MAPPABLE ? fence_alignment :
>  						unfenced_alignment;
> @@ -3433,13 +3438,21 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  	if (IS_ERR(vma))
>  		goto err_unpin;
>  
> +	if (flags & PIN_HIGH) {
> +		search_flag = DRM_MM_SEARCH_BELOW;
> +		alloc_flag = DRM_MM_CREATE_TOP;
> +	} else {
> +		search_flag = DRM_MM_SEARCH_DEFAULT;
> +		alloc_flag = DRM_MM_CREATE_DEFAULT;
> +	}
> +
>  search_free:
>  	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
>  						  size, alignment,
>  						  obj->cache_level,
>  						  start, end,
> -						  DRM_MM_SEARCH_DEFAULT,
> -						  DRM_MM_CREATE_DEFAULT);
> +						  search_flag,
> +						  alloc_flag);
>  	if (ret) {
>  		ret = i915_gem_evict_something(dev, vm, size, alignment,
>  					       obj->cache_level,
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 67ef118..edc17be 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -590,10 +590,17 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
>  		flags |= PIN_GLOBAL;
>  
>  	if (!drm_mm_node_allocated(&vma->node)) {
> +		/* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
> +		 * limit address to the first 4GBs for unflagged objects.
> +		 */
> +		if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0)
> +			flags |= PIN_ZONE_4G;
>  		if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
>  			flags |= PIN_GLOBAL | PIN_MAPPABLE;
>  		if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS)
>  			flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
> +		if ((flags & PIN_MAPPABLE) == 0)
> +			flags |= PIN_HIGH;
>  	}
>  
>  	ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags);
> @@ -671,6 +678,10 @@ eb_vma_misplaced(struct i915_vma *vma)
>  	if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
>  		return !only_mappable_for_reloc(entry->flags);
>  
> +	if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
> +	    (vma->node.start + vma->node.size - 1) >> 32)
> +		return true;
> +
>  	return false;
>  }
>  
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index fd5aa47..484a9fb 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -690,7 +690,8 @@ struct drm_i915_gem_exec_object2 {
>  #define EXEC_OBJECT_NEEDS_FENCE (1<<0)
>  #define EXEC_OBJECT_NEEDS_GTT	(1<<1)
>  #define EXEC_OBJECT_WRITE	(1<<2)
> -#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1)
> +#define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
> +#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_SUPPORTS_48B_ADDRESS<<1)
>  	__u64 flags;
>  
>  	__u64 rsvd1;
> -- 
> 2.6.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915/gen8: Flip the 48b switch
  2015-09-30 14:36 ` [PATCH 2/2] drm/i915/gen8: Flip the 48b switch Michel Thierry
@ 2015-10-01 13:16   ` Daniel Vetter
  2015-10-01 14:40     ` Michel Thierry
  2015-10-16 12:23     ` Michel Thierry
  0 siblings, 2 replies; 13+ messages in thread
From: Daniel Vetter @ 2015-10-01 13:16 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Wed, Sep 30, 2015 at 03:36:19PM +0100, Michel Thierry wrote:
> Use 48b addresses if hw supports it (i915.enable_ppgtt=3).
> Update the sanitize_enable_ppgtt for 48 bit PPGTT mode.
> 
> Note, aliasing PPGTT remains 32b only.
> 
> v2: s/full_64b/full_48b/. (Akash)
> v3: Add sanitize_enable_ppgtt changes until here. (Akash)
> v4: Update param description (Chris)
> 
> Cc: Akash Goel <akash.goel@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>

Queued for -next, thanks for the patch. Can you please ping someone from
mesa to push the libdrm/mesa patches too?

Cheers, Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 7 ++++++-
>  drivers/gpu/drm/i915/i915_params.c  | 2 +-
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 47344d0..2d964a7 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -104,9 +104,11 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>  {
>  	bool has_aliasing_ppgtt;
>  	bool has_full_ppgtt;
> +	bool has_full_48bit_ppgtt;
>  
>  	has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
>  	has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
> +	has_full_48bit_ppgtt = IS_BROADWELL(dev) || INTEL_INFO(dev)->gen >= 9;
>  
>  	if (intel_vgpu_active(dev))
>  		has_full_ppgtt = false; /* emulation is too hard */
> @@ -125,6 +127,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>  	if (enable_ppgtt == 2 && has_full_ppgtt)
>  		return 2;
>  
> +	if (enable_ppgtt == 3 && has_full_48bit_ppgtt)
> +		return 3;
> +
>  #ifdef CONFIG_INTEL_IOMMU
>  	/* Disable ppgtt on SNB if VT-d is on. */
>  	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
> @@ -141,7 +146,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>  	}
>  
>  	if (INTEL_INFO(dev)->gen >= 8 && i915.enable_execlists)
> -		return 2;
> +		return has_full_48bit_ppgtt ? 3 : 2;
>  	else
>  		return has_aliasing_ppgtt ? 1 : 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index ca9b8f6..368df67 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -111,7 +111,7 @@ MODULE_PARM_DESC(enable_hangcheck,
>  module_param_named_unsafe(enable_ppgtt, i915.enable_ppgtt, int, 0400);
>  MODULE_PARM_DESC(enable_ppgtt,
>  	"Override PPGTT usage. "
> -	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
> +	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full, 3=full with extended address space)");
>  
>  module_param_named_unsafe(enable_execlists, i915.enable_execlists, int, 0400);
>  MODULE_PARM_DESC(enable_execlists,
> -- 
> 2.6.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915/gen8: Flip the 48b switch
  2015-10-01 13:16   ` Daniel Vetter
@ 2015-10-01 14:40     ` Michel Thierry
  2015-10-16 12:23     ` Michel Thierry
  1 sibling, 0 replies; 13+ messages in thread
From: Michel Thierry @ 2015-10-01 14:40 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On 10/1/2015 2:16 PM, Daniel Vetter wrote:
> On Wed, Sep 30, 2015 at 03:36:19PM +0100, Michel Thierry wrote:
>> Use 48b addresses if hw supports it (i915.enable_ppgtt=3).
>> Update the sanitize_enable_ppgtt for 48 bit PPGTT mode.
>>
>> Note, aliasing PPGTT remains 32b only.
>>
>> v2: s/full_64b/full_48b/. (Akash)
>> v3: Add sanitize_enable_ppgtt changes until here. (Akash)
>> v4: Update param description (Chris)
>>
>> Cc: Akash Goel <akash.goel@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>
> Queued for -next, thanks for the patch. Can you please ping someone from
> mesa to push the libdrm/mesa patches too?
>

Sure, I'll start the libdmr-mesa process.

Thanks,
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915/gen8: Flip the 48b switch
  2015-10-01 13:16   ` Daniel Vetter
  2015-10-01 14:40     ` Michel Thierry
@ 2015-10-16 12:23     ` Michel Thierry
  2015-10-19  9:44       ` Daniel Vetter
  1 sibling, 1 reply; 13+ messages in thread
From: Michel Thierry @ 2015-10-16 12:23 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On 10/1/2015 2:16 PM, Daniel Vetter wrote:
> On Wed, Sep 30, 2015 at 03:36:19PM +0100, Michel Thierry wrote:
>> Use 48b addresses if hw supports it (i915.enable_ppgtt=3).
>> Update the sanitize_enable_ppgtt for 48 bit PPGTT mode.
>>
>> Note, aliasing PPGTT remains 32b only.
>>
>> v2: s/full_64b/full_48b/. (Akash)
>> v3: Add sanitize_enable_ppgtt changes until here. (Akash)
>> v4: Update param description (Chris)
>>
>> Cc: Akash Goel <akash.goel@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>
> Queued for -next, thanks for the patch. Can you please ping someone from
> mesa to push the libdrm/mesa patches too?
>
> Cheers, Daniel
>

Sorry I didn't report this earlier, but I just realized I can't find 
this patch in the tree.

There's still the discussion of where to apply the 32-bit workaround and 
the different behavior seen in mesa (not if it isn't needed)... but the 
code was written to not require that flag in order to start using the 
4-level page translation.

Thanks,

-Michel

>> ---
>>   drivers/gpu/drm/i915/i915_gem_gtt.c | 7 ++++++-
>>   drivers/gpu/drm/i915/i915_params.c  | 2 +-
>>   2 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 47344d0..2d964a7 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -104,9 +104,11 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>>   {
>>   	bool has_aliasing_ppgtt;
>>   	bool has_full_ppgtt;
>> +	bool has_full_48bit_ppgtt;
>>
>>   	has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
>>   	has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
>> +	has_full_48bit_ppgtt = IS_BROADWELL(dev) || INTEL_INFO(dev)->gen >= 9;
>>
>>   	if (intel_vgpu_active(dev))
>>   		has_full_ppgtt = false; /* emulation is too hard */
>> @@ -125,6 +127,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>>   	if (enable_ppgtt == 2 && has_full_ppgtt)
>>   		return 2;
>>
>> +	if (enable_ppgtt == 3 && has_full_48bit_ppgtt)
>> +		return 3;
>> +
>>   #ifdef CONFIG_INTEL_IOMMU
>>   	/* Disable ppgtt on SNB if VT-d is on. */
>>   	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
>> @@ -141,7 +146,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>>   	}
>>
>>   	if (INTEL_INFO(dev)->gen >= 8 && i915.enable_execlists)
>> -		return 2;
>> +		return has_full_48bit_ppgtt ? 3 : 2;
>>   	else
>>   		return has_aliasing_ppgtt ? 1 : 0;
>>   }
>> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
>> index ca9b8f6..368df67 100644
>> --- a/drivers/gpu/drm/i915/i915_params.c
>> +++ b/drivers/gpu/drm/i915/i915_params.c
>> @@ -111,7 +111,7 @@ MODULE_PARM_DESC(enable_hangcheck,
>>   module_param_named_unsafe(enable_ppgtt, i915.enable_ppgtt, int, 0400);
>>   MODULE_PARM_DESC(enable_ppgtt,
>>   	"Override PPGTT usage. "
>> -	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
>> +	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full, 3=full with extended address space)");
>>
>>   module_param_named_unsafe(enable_execlists, i915.enable_execlists, int, 0400);
>>   MODULE_PARM_DESC(enable_execlists,
>> --
>> 2.6.0
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915/gen8: Flip the 48b switch
  2015-10-16 12:23     ` Michel Thierry
@ 2015-10-19  9:44       ` Daniel Vetter
  2015-10-19 10:01         ` Chris Wilson
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Vetter @ 2015-10-19  9:44 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Fri, Oct 16, 2015 at 01:23:41PM +0100, Michel Thierry wrote:
> On 10/1/2015 2:16 PM, Daniel Vetter wrote:
> >On Wed, Sep 30, 2015 at 03:36:19PM +0100, Michel Thierry wrote:
> >>Use 48b addresses if hw supports it (i915.enable_ppgtt=3).
> >>Update the sanitize_enable_ppgtt for 48 bit PPGTT mode.
> >>
> >>Note, aliasing PPGTT remains 32b only.
> >>
> >>v2: s/full_64b/full_48b/. (Akash)
> >>v3: Add sanitize_enable_ppgtt changes until here. (Akash)
> >>v4: Update param description (Chris)
> >>
> >>Cc: Akash Goel <akash.goel@intel.com>
> >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> >
> >Queued for -next, thanks for the patch. Can you please ping someone from
> >mesa to push the libdrm/mesa patches too?
> >
> >Cheers, Daniel
> >
> 
> Sorry I didn't report this earlier, but I just realized I can't find this
> patch in the tree.

Indeed I seem to have never pushed it. Sorry.

> There's still the discussion of where to apply the 32-bit workaround and the
> different behavior seen in mesa (not if it isn't needed)... but the code was
> written to not require that flag in order to start using the 4-level page
> translation.

Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] drm/i915/gen8: Flip the 48b switch
  2015-10-19  9:44       ` Daniel Vetter
@ 2015-10-19 10:01         ` Chris Wilson
  0 siblings, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2015-10-19 10:01 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, Oct 19, 2015 at 11:44:12AM +0200, Daniel Vetter wrote:
> On Fri, Oct 16, 2015 at 01:23:41PM +0100, Michel Thierry wrote:
> > On 10/1/2015 2:16 PM, Daniel Vetter wrote:
> > >On Wed, Sep 30, 2015 at 03:36:19PM +0100, Michel Thierry wrote:
> > >>Use 48b addresses if hw supports it (i915.enable_ppgtt=3).
> > >>Update the sanitize_enable_ppgtt for 48 bit PPGTT mode.
> > >>
> > >>Note, aliasing PPGTT remains 32b only.
> > >>
> > >>v2: s/full_64b/full_48b/. (Akash)
> > >>v3: Add sanitize_enable_ppgtt changes until here. (Akash)
> > >>v4: Update param description (Chris)
> > >>
> > >>Cc: Akash Goel <akash.goel@intel.com>
> > >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > >>Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> > >
> > >Queued for -next, thanks for the patch. Can you please ping someone from
> > >mesa to push the libdrm/mesa patches too?
> > >
> > >Cheers, Daniel
> > >
> > 
> > Sorry I didn't report this earlier, but I just realized I can't find this
> > patch in the tree.
> 
> Indeed I seem to have never pushed it. Sorry.
> 
> > There's still the discussion of where to apply the 32-bit workaround and the
> > different behavior seen in mesa (not if it isn't needed)... but the code was
> > written to not require that flag in order to start using the 4-level page
> > translation.
> 
> Queued for -next, thanks for the patch.

We never got an answer whether there was any impact on existing
userspace which is constained to 4GiB (where the last level of
pagetables are superfluous).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-10-19 10:01 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-30 14:36 [PATCH 0/2] Wa32bit & Enable 48bit PPGTT Michel Thierry
2015-09-30 14:36 ` [PATCH 1/2] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
2015-09-30 14:42   ` Chris Wilson
2015-09-30 15:45   ` Tvrtko Ursulin
2015-09-30 15:53     ` Chris Wilson
2015-10-01 12:33   ` [PATCH v2 (not really v2)] " Michel Thierry
2015-10-01 13:15     ` Daniel Vetter
2015-09-30 14:36 ` [PATCH 2/2] drm/i915/gen8: Flip the 48b switch Michel Thierry
2015-10-01 13:16   ` Daniel Vetter
2015-10-01 14:40     ` Michel Thierry
2015-10-16 12:23     ` Michel Thierry
2015-10-19  9:44       ` Daniel Vetter
2015-10-19 10:01         ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.