* [PATCH 1/6] drm/i915: trace vm eviction instead of everything
@ 2013-09-14 22:03 Ben Widawsky
  2013-09-14 22:03 ` [PATCH 2/6] drm/i915: Provide a cheap ggtt vma lookup Ben Widawsky
                   ` (4 more replies)
  0 siblings, 5 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-14 22:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

Tracing vm eviction is really the event we care about. For the cases
where we evict everything, we will still get the trace.

v2: Add the drm device to the trace since we might not be the only
device in the system. (Chris)
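
As an aside, the new tracepoint is consumed like any other i915 event;
a minimal userspace sketch for enabling it (assuming the standard
debugfs/tracefs mount point; not part of this patch):

	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		/* enable the i915_gem_evict_vm event added below */
		int fd = open("/sys/kernel/debug/tracing/events/i915/"
			      "i915_gem_evict_vm/enable", O_WRONLY);
		if (fd < 0)
			return 1;
		write(fd, "1", 1);
		close(fd);
		return 0;
	}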

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_evict.c |  2 ++
 drivers/gpu/drm/i915/i915_trace.h     | 15 +++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 3a3981e..b737653 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -175,6 +175,8 @@ int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle)
 	struct i915_vma *vma, *next;
 	int ret;
 
+	trace_i915_gem_evict_vm(vm);
+
 	if (do_idle) {
 		ret = i915_gpu_idle(vm->dev);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index e2c5ee6..403309b 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -233,6 +233,21 @@ TRACE_EVENT(i915_gem_evict_everything,
 	    TP_printk("dev=%d", __entry->dev)
 );
 
+TRACE_EVENT(i915_gem_evict_vm,
+	    TP_PROTO(struct i915_address_space *vm),
+	    TP_ARGS(vm),
+
+	    TP_STRUCT__entry(
+			     __field(struct i915_address_space *, vm)
+			    ),
+
+	    TP_fast_assign(
+			   __entry->vm = vm;
+			  ),
+
+	    TP_printk("dev=%d, vm=%p", __entry->vm->dev->primary->index, __entry->vm)
+);
+
 TRACE_EVENT(i915_gem_ring_dispatch,
 	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags),
 	    TP_ARGS(ring, seqno, flags),
-- 
1.8.4


* [PATCH 2/6] drm/i915: Provide a cheap ggtt vma lookup
  2013-09-14 22:03 [PATCH 1/6] drm/i915: trace vm eviction instead of everything Ben Widawsky
@ 2013-09-14 22:03 ` Ben Widawsky
  2013-09-14 22:03 ` [PATCH 3/6] drm/i915: Convert active API to VMA Ben Widawsky
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-14 22:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

"We do fairly often lookup the ggtt vma for an obj." - Chris Wilson. As
such, provide a function to offer slightly cheaper access to the vma.
Not performance tested. By my quick estimation it saves at least 3
pointer dereferences from the existing mechanism.

This patch mostly matches code from Chris in
<20130911221430.GB7825@nuc-i3427.alporthouse.com>
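
For illustration, a sketch of how a caller changes (both helpers are in
this series; the saving comes from no longer walking obj->vma_list and
comparing each vma->vm):

	/* before: the generic per-vm lookup */
	vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);

	/* after: the ggtt vma, when present, is the first list entry */
	vma = i915_gem_obj_to_ggtt(obj);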

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h |  4 +++-
 drivers/gpu/drm/i915/i915_gem.c | 17 +++++++++++++++--
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7caf71d..df43f71 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2009,6 +2009,9 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 struct i915_vma *
 i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
 				  struct i915_address_space *vm);
+
+struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj);
+
 /* Some GGTT VM helpers */
 #define obj_to_ggtt(obj) \
 	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
@@ -2045,7 +2048,6 @@ i915_gem_obj_ggtt_pin(struct drm_i915_gem_object *obj,
 	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
 				   map_and_fenceable, nonblocking);
 }
-#undef obj_to_ggtt
 
 /* i915_gem_context.c */
 void i915_gem_context_init(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3d3de6e..83f946c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3403,8 +3403,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 
 	/* And bump the LRU for this access */
 	if (i915_gem_object_is_inactive(obj)) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
-							   &dev_priv->gtt.base);
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
 		if (vma)
 			list_move_tail(&vma->mm_list,
 				       &dev_priv->gtt.base.inactive_list);
@@ -4942,3 +4941,17 @@ unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
 
 	return 0;
 }
+
+struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
+{
+	struct i915_vma *vma;
+
+	if (WARN_ON(list_empty(&obj->vma_list)))
+		return NULL;
+
+	vma = list_first_entry(&obj->vma_list, typeof(*vma), vma_link);
+	if (WARN_ON(vma->vm != obj_to_ggtt(obj)))
+		return NULL;
+
+	return vma;
+}
-- 
1.8.4


* [PATCH 3/6] drm/i915: Convert active API to VMA
  2013-09-14 22:03 [PATCH 1/6] drm/i915: trace vm eviction instead of everything Ben Widawsky
  2013-09-14 22:03 ` [PATCH 2/6] drm/i915: Provide a cheap ggtt vma lookup Ben Widawsky
@ 2013-09-14 22:03 ` Ben Widawsky
  2013-09-14 22:03 ` [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-14 22:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Even though we track object activity and not VMA activity, the
active_list is based on the VM, so it makes the most sense to use VMAs
in the APIs.

NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
now, leave them be.

v2: Remove leftover hunk from the previous patch which didn't keep
i915_gem_object_move_to_active. That patch had to rely on the ring to
get the dev instead of the obj. (Chris)
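
For illustration, the caller-side change this makes (taken from the
execbuffer hunk below):

	/* before: two steps, open-coding the VM's active list move */
	list_move_tail(&vma->mm_list, &vma->vm->active_list);
	i915_gem_object_move_to_active(obj, ring);

	/* after: a single VMA-based call */
	i915_vma_move_to_active(vma, ring);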

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 5 ++---
 drivers/gpu/drm/i915/i915_gem.c            | 9 ++++++++-
 drivers/gpu/drm/i915/i915_gem_context.c    | 5 +----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +--
 4 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index df43f71..427c537 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1884,9 +1884,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
-void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
-				    struct intel_ring_buffer *ring);
-
+void i915_vma_move_to_active(struct i915_vma *vma,
+			     struct intel_ring_buffer *ring);
 int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
 			 struct drm_mode_create_dumb *args);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 83f946c..651b91c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1910,7 +1910,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	return 0;
 }
 
-void
+static void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 			       struct intel_ring_buffer *ring)
 {
@@ -1949,6 +1949,13 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 }
 
+void i915_vma_move_to_active(struct i915_vma *vma,
+			     struct intel_ring_buffer *ring)
+{
+	list_move_tail(&vma->mm_list, &vma->vm->active_list);
+	return i915_gem_object_move_to_active(vma->obj, ring);
+}
+
 static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 26c3fcc..cb3b7e8 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -436,11 +436,8 @@ static int do_switch(struct i915_hw_context *to)
 	 * MI_SET_CONTEXT instead of when the next seqno has completed.
 	 */
 	if (from != NULL) {
-		struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
-		struct i915_address_space *ggtt = &dev_priv->gtt.base;
 		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-		list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
-		i915_gem_object_move_to_active(from->obj, ring);
+		i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->obj), ring);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
 		 * object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index ee93357..b26d979 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -872,8 +872,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
-		list_move_tail(&vma->mm_list, &vma->vm->active_list);
-		i915_gem_object_move_to_active(obj, ring);
+		i915_vma_move_to_active(vma, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
-- 
1.8.4


* [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-14 22:03 [PATCH 1/6] drm/i915: trace vm eviction instead of everything Ben Widawsky
  2013-09-14 22:03 ` [PATCH 2/6] drm/i915: Provide a cheap ggtt vma lookup Ben Widawsky
  2013-09-14 22:03 ` [PATCH 3/6] drm/i915: Convert active API to VMA Ben Widawsky
@ 2013-09-14 22:03 ` Ben Widawsky
  2013-09-16  9:25   ` Chris Wilson
  2013-09-14 22:03 ` [PATCH 5/6] drm/i915: Use the new vm [un]bind functions Ben Widawsky
  2013-09-14 22:03 ` [PATCH 6/6] drm/i915: eliminate vm->insert_entries() Ben Widawsky
  4 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-14 22:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm and let them choose the correct way
to handle the page table updates. This change allows many places in the
code to simply call vm->bind_vma(), without having to worry about
distinguishing PPGTT vs GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there is already a ggtt binding. This
happens when setting cache levels.
Use VMA for bind/unbind (Daniel, Ben)

v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.

v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
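
For illustration, the call shape this sets up (a sketch; as noted
above, nothing is switched over to these pointers until the next
patch):

	/* callers no longer need to distinguish GGTT from PPGTT */
	vma->vm->bind_vma(vma, cache_level, flags);
	...
	vma->vm->unbind_vma(vma);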

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  69 +++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.c | 106 ++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 427c537..686a66c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -465,6 +465,36 @@ enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an object's lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct drm_mm_node node;
+	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+
+	/** This vma's place in the batchbuffer or on the eviction list */
+	struct list_head exec_list;
+
+	/**
+	 * Used for performing relocations during execbuffer insertion.
+	 */
+	struct hlist_node exec_node;
+	unsigned long exec_handle;
+	struct drm_i915_gem_exec_object2 *exec_entry;
+
+};
+
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
@@ -503,9 +533,18 @@ struct i915_address_space {
 	/* FIXME: Need a more generic return type */
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
+
+	/** Unmap an object from an address space. This usually consists of
+	 * setting the valid PTE entries to a reserved scratch page. */
+	void (*unbind_vma)(struct i915_vma *vma);
 	void (*clear_range)(struct i915_address_space *vm,
 			    unsigned int first_entry,
 			    unsigned int num_entries);
+	/* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+	void (*bind_vma)(struct i915_vma *vma,
+			 enum i915_cache_level cache_level,
+			 u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       unsigned int first_entry,
@@ -552,36 +591,6 @@ struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an object's lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
-	struct drm_mm_node node;
-	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm;
-
-	/** This object's place on the active/inactive lists */
-	struct list_head mm_list;
-
-	struct list_head vma_link; /* Link in the object's VMA list */
-
-	/** This vma's place in the batchbuffer or on the eviction list */
-	struct list_head exec_list;
-
-	/**
-	 * Used for performing relocations during execbuffer insertion.
-	 */
-	struct hlist_node exec_node;
-	unsigned long exec_handle;
-	struct drm_i915_gem_exec_object2 *exec_entry;
-
-};
-
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 212f6d8..2a71a29 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -57,6 +57,11 @@
 #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 #define HSW_WT_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0x6)
 
+static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
+				enum i915_cache_level cache_level,
+				u32 flags);
+static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
+
 static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
 				     enum i915_cache_level level)
 {
@@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->base.unbind_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 				   cache_level);
 }
 
+static void __always_unused
+gen6_ppgtt_bind_vma(struct i915_vma *vma,
+		    enum i915_cache_level cache_level,
+		    u32 flags)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	WARN_ON(flags);
+
+	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
@@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 				obj->base.size >> PAGE_SHIFT);
 }
 
+static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ppgtt_clear_range(vma->vm, entry,
+			       vma->obj->base.size >> PAGE_SHIFT);
+}
+
 extern int intel_iommu_gfx_mapped;
 /* Certain Gen5 chipsets require idling the GPU before
  * unmapping anything from the GTT when VT-d is enabled.
@@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 
 }
 
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 unused)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	vma->obj->has_global_gtt_mapping = 1;
+}
+
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
@@ -599,6 +639,47 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned int first = vma->node.start >> PAGE_SHIFT;
+	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	vma->obj->has_global_gtt_mapping = 0;
+	intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 flags)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
+	 * the global, just use aliasing */
+	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
+		/* If the object is unbound, or we're change the cache bits */
+		if (!obj->has_global_gtt_mapping ||
+		    (cache_level != obj->cache_level)) {
+			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
+						 cache_level);
+			obj->has_global_gtt_mapping = 1;
+		}
+	}
+
+	/* If put the mapping in the aliasing PPGTT as well as Global if we have
+	 * aliasing, but the user requested global. */
+	if (dev_priv->mm.aliasing_ppgtt &&
+	    (!obj->has_aliasing_ppgtt_mapping ||
+	     (cache_level != obj->cache_level))) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+	}
+}
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 			      enum i915_cache_level cache_level)
@@ -627,6 +708,27 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 	obj->has_global_gtt_mapping = 0;
 }
 
+static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	if (obj->has_global_gtt_mapping) {
+		gen6_ggtt_clear_range(vma->vm, entry,
+				      vma->obj->base.size >> PAGE_SHIFT);
+		obj->has_global_gtt_mapping = 0;
+	}
+
+	if (obj->has_aliasing_ppgtt_mapping) {
+		gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+				       entry,
+				       obj->base.size >> PAGE_SHIFT);
+		obj->has_aliasing_ppgtt_mapping = 0;
+	}
+}
+
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -860,7 +962,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
 		DRM_ERROR("Scratch setup failed\n");
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
 
 	return ret;
 }
@@ -892,7 +996,9 @@ static int i915_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
 
 	return 0;
 }
-- 
1.8.4


* [PATCH 5/6] drm/i915: Use the new vm [un]bind functions
  2013-09-14 22:03 [PATCH 1/6] drm/i915: trace vm eviction instead of everything Ben Widawsky
                   ` (2 preceding siblings ...)
  2013-09-14 22:03 ` [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-09-14 22:03 ` Ben Widawsky
  2013-09-16  7:37   ` Chris Wilson
  2013-09-14 22:03 ` [PATCH 6/6] drm/i915: eliminate vm->insert_entries() Ben Widawsky
  4 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-14 22:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed.
Make sure we bind to the global gtt when mappable and fenceable; I
thought we could get away without this initially, but we cannot.
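
For illustration, the pin path now reduces to this (taken from the
i915_gem_object_pin() hunk below):

	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
	...
	vma = i915_gem_obj_to_vma(obj, vm);
	vm->bind_vma(vma, obj->cache_level, flags);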

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  9 ------
 drivers/gpu/drm/i915/i915_gem.c            | 31 ++++++++-----------
 drivers/gpu/drm/i915/i915_gem_context.c    |  8 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 ++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------------
 5 files changed, 34 insertions(+), 91 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 686a66c..ab88b43 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2085,17 +2085,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 651b91c..4e6f20a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2693,12 +2693,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3424,7 +3420,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3463,11 +3458,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3787,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3823,20 +3816,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bind an object for
+	 * the first time, and we had an aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind. */
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cb3b7e8..a030739 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -415,8 +416,11 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
+							   &dev_priv->gtt.base);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b26d979..bfe8cef 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+				 GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -1117,8 +1109,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE &&
+	    !batch_obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
+		vm->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
+	}
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2a71a29..af2080b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -681,33 +666,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.4


* [PATCH 6/6] drm/i915: eliminate vm->insert_entries()
  2013-09-14 22:03 [PATCH 1/6] drm/i915: trace vm eviction instead of everything Ben Widawsky
                   ` (3 preceding siblings ...)
  2013-09-14 22:03 ` [PATCH 5/6] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-09-14 22:03 ` Ben Widawsky
  4 siblings, 0 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-14 22:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range as well, but
that's not totally easy at this point, since it's still used in a
couple of places that don't deal only in objects: setup, ppgtt init,
and restoring gtt mappings.
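
For reference, the hook that stays behind has this shape (signature
from the earlier patch in this series; the call below is an
illustrative sketch, not a hunk from this patch):

	/* still needed by setup, ppgtt init, and the restore path,
	 * which operate on ranges rather than objects */
	vm->clear_range(vm, first_entry, num_entries);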

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index af2080b..09b8aba 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -339,8 +339,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->enable = gen6_ppgtt_enable;
 	ppgtt->base.unbind_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
-	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
+	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
 	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
@@ -591,19 +591,6 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 	readl(gtt_base);
 }
 
-
-static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     unsigned int pg_start,
-				     enum i915_cache_level cache_level)
-{
-	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
-		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
-
-	intel_gtt_insert_sg_entries(st, pg_start, flags);
-
-}
-
 static void i915_ggtt_bind_vma(struct i915_vma *vma,
 			       enum i915_cache_level cache_level,
 			       u32 unused)
@@ -921,7 +908,6 @@ static int gen6_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
 	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
-	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
 	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
 
 	return ret;
@@ -955,7 +941,6 @@ static int i915_gmch_probe(struct drm_device *dev,
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
 	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
-	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
 	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
 
 	return 0;
-- 
1.8.4


* Re: [PATCH 5/6] drm/i915: Use the new vm [un]bind functions
  2013-09-14 22:03 ` [PATCH 5/6] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-09-16  7:37   ` Chris Wilson
  2013-09-16 18:31     ` Ben Widawsky
  2013-09-17 17:01     ` [PATCH 5/6] [v3] " Ben Widawsky
  0 siblings, 2 replies; 55+ messages in thread
From: Chris Wilson @ 2013-09-16  7:37 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Ben Widawsky

On Sat, Sep 14, 2013 at 03:03:18PM -0700, Ben Widawsky wrote:
> @@ -1117,8 +1109,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
>  	 * hsw should have this fixed, but let's be paranoid and do it
>  	 * unconditionally for now. */
> -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +	if (flags & I915_DISPATCH_SECURE &&
> +	    !batch_obj->has_global_gtt_mapping) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
> +		vm->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> +	}

This should be ggtt rather than vm?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-14 22:03 ` [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-09-16  9:25   ` Chris Wilson
  2013-09-16 18:23     ` Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-16  9:25 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Ben Widawsky

On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> +			       enum i915_cache_level cache_level,
> +			       u32 flags)
> +{
> +	struct drm_device *dev = vma->vm->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_gem_object *obj = vma->obj;
> +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> +	 * the global, just use aliasing */
> +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> +		/* If the object is unbound, or we're change the cache bits */
> +		if (!obj->has_global_gtt_mapping ||
> +		    (cache_level != obj->cache_level)) {
> +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> +						 cache_level);
> +			obj->has_global_gtt_mapping = 1;
> +		}
> +	}
> +
> +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> +	 * aliasing, but the user requested global. */

Why? As a proponent of full-ppgtt I thought you would be envisioning a
future where the aliasing_ppgtt was used far less (i.e. never), and the
ggtt would only continue to be used for the truly global entries such as
scanouts, contexts, pdes, execlists etc.

> +	if (dev_priv->mm.aliasing_ppgtt &&
> +	    (!obj->has_aliasing_ppgtt_mapping ||
> +	     (cache_level != obj->cache_level))) {
> +		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> +					  vma->obj->pages, entry, cache_level);
> +		vma->obj->has_aliasing_ppgtt_mapping = 1;
> +	}
> +}

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-16  9:25   ` Chris Wilson
@ 2013-09-16 18:23     ` Ben Widawsky
  2013-09-16 22:05       ` Daniel Vetter
  2013-09-16 22:13       ` Chris Wilson
  0 siblings, 2 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-16 18:23 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, intel-gfx

On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > +			       enum i915_cache_level cache_level,
> > +			       u32 flags)
> > +{
> > +	struct drm_device *dev = vma->vm->dev;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_gem_object *obj = vma->obj;
> > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > +	 * the global, just use aliasing */
> > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > +		/* If the object is unbound, or we're change the cache bits */
> > +		if (!obj->has_global_gtt_mapping ||
> > +		    (cache_level != obj->cache_level)) {
> > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > +						 cache_level);
> > +			obj->has_global_gtt_mapping = 1;
> > +		}
> > +	}
> > +
> > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > +	 * aliasing, but the user requested global. */
> 
> Why? As a proponent of full-ppgtt I thought you would be envisioning a
> future where the aliasing_ppgtt was used far less (i.e. never), and the
> ggtt would only continue to be used for the truly global entries such as
> scanouts, contexts, pdes, execlists etc.
> 

Firstly, I've still yet to expose the grand plan at this point in the
series, so I am not really certain if you're just complaining for the
fun of it, or what. I'd like to make everything functionally the same,
just with VMA support.

Secondly, I was under the impression that for Sandybridge we had to have
all global mappings in the aliasing to support PIPE_CONTROL, or some
command like that. It's a bit mixed up in my head atm, and I'm too lazy
to look at the exact reason.

Finally, see firstly. I'll try to rip it out later on if it's possible.

[snip]

> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ben Widawsky, Intel Open Source Technology Center


* Re: [PATCH 5/6] drm/i915: Use the new vm [un]bind functions
  2013-09-16  7:37   ` Chris Wilson
@ 2013-09-16 18:31     ` Ben Widawsky
  2013-09-17 17:01     ` [PATCH 5/6] [v3] " Ben Widawsky
  1 sibling, 0 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-16 18:31 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, intel-gfx

On Mon, Sep 16, 2013 at 08:37:05AM +0100, Chris Wilson wrote:
> On Sat, Sep 14, 2013 at 03:03:18PM -0700, Ben Widawsky wrote:
> > @@ -1117,8 +1109,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> >  	 * hsw should have this fixed, but let's be paranoid and do it
> >  	 * unconditionally for now. */
> > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > +	if (flags & I915_DISPATCH_SECURE &&
> > +	    !batch_obj->has_global_gtt_mapping) {
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
> > +		vm->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > +	}
> 
> This should be ggtt rather than vm?
> -Chris

Yep. Thanks. I will also use the new helper function you requested.
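
For reference, a hedged sketch of what that fix might look like (the
revised patch itself is not in this excerpt):

	if (flags & I915_DISPATCH_SECURE &&
	    !batch_obj->has_global_gtt_mapping) {
		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
		vma->vm->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
	}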
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ben Widawsky, Intel Open Source Technology Center


* Re: [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-16 18:23     ` Ben Widawsky
@ 2013-09-16 22:05       ` Daniel Vetter
  2013-09-16 22:18         ` Chris Wilson
  2013-09-16 22:13       ` Chris Wilson
  1 sibling, 1 reply; 55+ messages in thread
From: Daniel Vetter @ 2013-09-16 22:05 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Ben Widawsky

On Mon, Sep 16, 2013 at 11:23:43AM -0700, Ben Widawsky wrote:
> On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> > On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > +			       enum i915_cache_level cache_level,
> > > +			       u32 flags)
> > > +{
> > > +	struct drm_device *dev = vma->vm->dev;
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct drm_i915_gem_object *obj = vma->obj;
> > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > +	 * the global, just use aliasing */
> > > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > > +		/* If the object is unbound, or we're change the cache bits */
> > > +		if (!obj->has_global_gtt_mapping ||
> > > +		    (cache_level != obj->cache_level)) {
> > > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > > +						 cache_level);
> > > +			obj->has_global_gtt_mapping = 1;
> > > +		}
> > > +	}
> > > +
> > > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > > +	 * aliasing, but the user requested global. */
> > 
> > Why? As a proponent of full-ppgtt I thought you would be envisioning a
> > future where the aliasing_ppgtt was used far less (i.e. never), and the
> > ggtt would only continue to be used for the truly global entries such as
> > scanouts, contexts, pdes, execlists etc.
> > 
> 
> Firstly, I've still yet to expose the grand plan at this point in the
> series, so I am not really certain if you're just complaining for the
> fun of it, or what. I'd like to make everything functionally the same,
> just with VMA support.
> 
> Secondly, I was under the impression that for Sandybridge we had to have
> all global mappings in the aliasing to support PIPE_CONTROL, or some
> command like that. It's a bit mixed up in my head atm, and I'm too lazy
> to look at the exact reason.
> 
> Finally, see firstly. I'll try to rip it out later on if it's possible.

Ben's right afaik, we need this kludge to keep snb happy. And we need
ppgtt to make kernel cs scanning possible, which we seem to need for
geometry shaders or some other gl3.2/3 feature. So not much choice I'd say
...

-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-16 18:23     ` Ben Widawsky
  2013-09-16 22:05       ` Daniel Vetter
@ 2013-09-16 22:13       ` Chris Wilson
  2013-09-17  5:44         ` Ben Widawsky
  1 sibling, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-16 22:13 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Ben Widawsky

On Mon, Sep 16, 2013 at 11:23:43AM -0700, Ben Widawsky wrote:
> On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> > On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > +			       enum i915_cache_level cache_level,
> > > +			       u32 flags)
> > > +{
> > > +	struct drm_device *dev = vma->vm->dev;
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct drm_i915_gem_object *obj = vma->obj;
> > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > +	 * the global, just use aliasing */
> > > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > > +		/* If the object is unbound, or we're change the cache bits */
> > > +		if (!obj->has_global_gtt_mapping ||
> > > +		    (cache_level != obj->cache_level)) {
> > > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > > +						 cache_level);
> > > +			obj->has_global_gtt_mapping = 1;
> > > +		}
> > > +	}
> > > +
> > > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > > +	 * aliasing, but the user requested global. */
> > 
> > Why? As a proponent of full-ppgtt I thought you would be envisioning a
> > future where the aliasing_ppgtt was used far less (i.e. never), and the
> > ggtt would only continue to be used for the truly global entries such as
> > scanouts, contexts, pdes, execlists etc.
> > 
> 
> Firstly, I've still yet to expose the grand plan at this point in the
> series, so I am not really certain if you're just complaining for the
> fun of it, or what. I'd like to make everything functionally the same,
> just with VMA support.

I'm complaining because the comment is awful: telling me what the code
is doing but not why. It doesn't seem obvious that, if the user
explicitly wanted a global mapping and the object is not already in the
aliasing ppgtt, it is likely to be used in the aliasing ppgtt in the
near future.

> Secondly, I was under the impression that for Sandybridge we had to have
> all global mappings in the aliasing to support PIPE_CONTROL, or some
> command like that. It's a bit mixed up in my head atm, and I'm too lazy
> to look at the exact reason.

It does, but if we never enable full-ppgtt for SNB we don't have to
worry about full-ppgtt being unusable for OpenGL (at least not without a
1:1 ppgtt to global mapping of all oq objects).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-16 22:05       ` Daniel Vetter
@ 2013-09-16 22:18         ` Chris Wilson
  2013-09-16 22:20           ` Daniel Vetter
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-16 22:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Ben Widawsky, intel-gfx, Ben Widawsky

On Tue, Sep 17, 2013 at 12:05:46AM +0200, Daniel Vetter wrote:
> On Mon, Sep 16, 2013 at 11:23:43AM -0700, Ben Widawsky wrote:
> > On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> > > On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > > +			       enum i915_cache_level cache_level,
> > > > +			       u32 flags)
> > > > +{
> > > > +	struct drm_device *dev = vma->vm->dev;
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > +	struct drm_i915_gem_object *obj = vma->obj;
> > > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > > +
> > > > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > > +	 * the global, just use aliasing */
> > > > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > > > +		/* If the object is unbound, or we're change the cache bits */
> > > > +		if (!obj->has_global_gtt_mapping ||
> > > > +		    (cache_level != obj->cache_level)) {
> > > > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > > > +						 cache_level);
> > > > +			obj->has_global_gtt_mapping = 1;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > > > +	 * aliasing, but the user requested global. */
> > > 
> > > Why? As a proponent of full-ppgtt I thought you would be envisioning a
> > > future where the aliasing_ppgtt was used far less (i.e. never), and the
> > > ggtt would only continue to be used for the truly global entries such as
> > > scanouts, contexts, pdes, execlists etc.
> > > 
> > 
> > Firstly, I've still yet to expose the grand plan at this point in the
> > series, so I am not really certain if you're just complaining for the
> > fun of it, or what. I'd like to make everything functionally the same,
> > just with VMA support.
> > 
> > Secondly, I was under the impression that for Sandybridge we had to have
> > all global mappings in the aliasing to support PIPE_CONTROL, or some
> > command like that. It's a bit mixed up in my head atm, and I'm too lazy
> > to look at the exact reason.
> > 
> > Finally, see firstly. I'll try to rip it out later on if it's possible.
> 
> Ben's right afaik, we need this kludge to keep snb happy. And we need
> ppgtt to make kernel cs scanning possible, which we seem to need for
> geometry shaders or some other gl3.2/3 feature. So not much choice I'd say
> ...

It shouldn't be a kludge here, but in execbuffer where we detect the SNB
w/a and so should pass the flag down to the bind and make sure we have any
other fixup required (see the extra details required for full-ppgtt).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-16 22:18         ` Chris Wilson
@ 2013-09-16 22:20           ` Daniel Vetter
  0 siblings, 0 replies; 55+ messages in thread
From: Daniel Vetter @ 2013-09-16 22:20 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Ben Widawsky, Ben Widawsky, intel-gfx

On Tue, Sep 17, 2013 at 12:18 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> Ben's right afaik, we need this kludge to keep snb happy. And we need
>> ppgtt to make kernel cs scanning possible, which we seem to need for
>> geometry shaders or some other gl3.2/3 feature. So not much choice I'd say
>> ...
>
> It shouldn't be a kludge here, but in execbuffer where we detect the SNB
> w/a and so should pass the flag down to the bind and make sure we have any
> other fixup required (see the extra details required for full-ppgtt).

Yeah, I've written my response without actually reading the code
really, only the discussion, and gotten all confused. So I'll hide in
shame a bit and let you two duke this out ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-16 22:13       ` Chris Wilson
@ 2013-09-17  5:44         ` Ben Widawsky
  2013-09-17  7:49           ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-17  5:44 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, intel-gfx

On Mon, Sep 16, 2013 at 11:13:02PM +0100, Chris Wilson wrote:
> On Mon, Sep 16, 2013 at 11:23:43AM -0700, Ben Widawsky wrote:
> > On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> > > On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > > +			       enum i915_cache_level cache_level,
> > > > +			       u32 flags)
> > > > +{
> > > > +	struct drm_device *dev = vma->vm->dev;
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > +	struct drm_i915_gem_object *obj = vma->obj;
> > > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > > +
> > > > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > > +	 * the global, just use aliasing */
> > > > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > > > +		/* If the object is unbound, or we're change the cache bits */
> > > > +		if (!obj->has_global_gtt_mapping ||
> > > > +		    (cache_level != obj->cache_level)) {
> > > > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > > > +						 cache_level);
> > > > +			obj->has_global_gtt_mapping = 1;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > > > +	 * aliasing, but the user requested global. */
> > > 
> > > Why? As a proponent of full-ppgtt I thought you would be envisioning a
> > > future where the aliasing_ppgtt was used far less (i.e. never), and the
> > > ggtt would only continue to be used for the truly global entries such as
> > > scanouts, contexts, pdes, execlists etc.
> > > 
> > 
> > Firstly, I've still yet to expose the grand plan at this point in the
> > series, so I am not really certain if you're just complaining for the
> > fun of it, or what. I'd like to make everything functionally the same,
> > just with VMA support.
> 
> I'm complaining because the comment is awful: telling me what the code
> is doing but not why. It doesn't seem obvious that, if the user
> explicitly wanted a global mapping and the object is not already in the
> aliasing ppgtt, it is likely to be used in the aliasing ppgtt in the
> near future.
> 
> > Secondly, I was under the impression that for Sandybridge we had to have
> > all global mappings in the aliasing to support PIPE_CONTROL, or some
> > command like that. It's a bit mixed up in my head atm, and I'm too lazy
> > to look at the exact reason.
> 
> It does, but if we never enable full-ppgtt for SNB we don't have to
> worry about full-ppgtt being unusable for OpenGL (at least not without a
> 1:1 ppgtt to global mapping of all OQ (occlusion query) objects).
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

I'm sorry. After reading my comments again, you're absolutely right.

How's this?

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2a71a29..fcf36ae 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -658,10 +658,17 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
        struct drm_i915_gem_object *obj = vma->obj;
        const unsigned long entry = vma->node.start >> PAGE_SHIFT;
 
-       /* If there is an aliasing PPGTT, and the user didn't explicitly ask for
-        * the global, just use aliasing */
+       /* If there is no aliasing PPGTT, or the caller needs a global mapping,
+        * or we have a global mapping already but the cacheability flags have
+        * changed, set the global PTEs.
+        *
+        * If there is an aliasing PPGTT it is anecdotally faster, so use that
+        * instead if none of the above hold true.
+        *
+        * NB: A global mapping should only be needed for special regions like
+        * "gtt mappable", SNB errata, or if specified via special execbuf flags
+        */
        if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
-               /* If the object is unbound, or we're change the cache bits */
                if (!obj->has_global_gtt_mapping ||
                    (cache_level != obj->cache_level)) {
                        gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
@@ -670,8 +677,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
                }
        }
 
-       /* If put the mapping in the aliasing PPGTT as well as Global if we have
-        * aliasing, but the user requested global. */
        if (dev_priv->mm.aliasing_ppgtt &&
            (!obj->has_aliasing_ppgtt_mapping ||
             (cache_level != obj->cache_level))) {


-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM
  2013-09-17  5:44         ` Ben Widawsky
@ 2013-09-17  7:49           ` Chris Wilson
  2013-09-17 17:00             ` [PATCH 4/6] [v5] " Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-17  7:49 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Ben Widawsky

On Mon, Sep 16, 2013 at 10:44:29PM -0700, Ben Widawsky wrote:
> I'm sorry. After reading my comments again, you're absolutely right.
> 
> How's this?

I'm liking it better, since it gives some insight into why GLOBAL_BIND
is required.

> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 2a71a29..fcf36ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -658,10 +658,17 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
>         struct drm_i915_gem_object *obj = vma->obj;
>         const unsigned long entry = vma->node.start >> PAGE_SHIFT;
>  
> -       /* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> -        * the global, just use aliasing */
> +       /* If there is no aliasing PPGTT, or the caller needs a global mapping,
> +        * or we have a global mapping already but the cacheability flags have
> +        * changed, set the global PTEs.
> +        *
> +        * If there is an aliasing PPGTT it is anecdotally faster, so use that
> +        * instead if none of the above hold true.
> +        *
> +        * NB: A global mapping should only be needed for special regions like
> +        * "gtt mappable", SNB errata, or if specified via special execbuf flags

+ At all other times, the GPU will use the aliasing ppgtt.

Just adds that extra bit of stress that the predominant access is
through the aliasing ppgtt.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 4/6] [v5] drm/i915: Add bind/unbind object functions to VM
  2013-09-17  7:49           ` Chris Wilson
@ 2013-09-17 17:00             ` Ben Widawsky
  0 siblings, 0 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-17 17:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply call vm->bind_vma(), without having to worry about
distinguishing PPGTT vs GGTT.
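
For illustration, once the pointers are wired up a call site reduces to
something like the following sketch (helper names as used elsewhere in
this series; illustrative only, not a hunk from this patch):

	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);

	/* Map into this address space; GLOBAL_BIND forces the global
	 * PTEs to be set even when an aliasing PPGTT is present. */
	vm->bind_vma(vma, obj->cache_level, 0 /* or GLOBAL_BIND */);

	/* ... use the mapping ... */

	/* Later, tear the mapping down through the same vm. */
	vm->unbind_vma(vma);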

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens when setting cache levels.
Use VMA for bind/unbind (Daniel, Ben)

v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.

v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)

v5: Update the comment to not suck (Chris)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  69 ++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.c | 112 ++++++++++++++++++++++++++++++++++++
 2 files changed, 151 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 427c537..686a66c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -465,6 +465,36 @@ enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (i.e. no refcounting), a VMA's lifetime
+ * will always be <= an object's lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct drm_mm_node node;
+	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+
+	/** This vma's place in the batchbuffer or on the eviction list */
+	struct list_head exec_list;
+
+	/**
+	 * Used for performing relocations during execbuffer insertion.
+	 */
+	struct hlist_node exec_node;
+	unsigned long exec_handle;
+	struct drm_i915_gem_exec_object2 *exec_entry;
+
+};
+
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
@@ -503,9 +533,18 @@ struct i915_address_space {
 	/* FIXME: Need a more generic return type */
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
+
+	/** Unmap an object from an address space. This usually consists of
+	 * setting the valid PTE entries to a reserved scratch page. */
+	void (*unbind_vma)(struct i915_vma *vma);
 	void (*clear_range)(struct i915_address_space *vm,
 			    unsigned int first_entry,
 			    unsigned int num_entries);
+	/* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+	void (*bind_vma)(struct i915_vma *vma,
+			 enum i915_cache_level cache_level,
+			 u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       unsigned int first_entry,
@@ -552,36 +591,6 @@ struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (i.e. no refcounting), a VMA's lifetime
- * will always be <= an object's lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
-	struct drm_mm_node node;
-	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm;
-
-	/** This object's place on the active/inactive lists */
-	struct list_head mm_list;
-
-	struct list_head vma_link; /* Link in the object's VMA list */
-
-	/** This vma's place in the batchbuffer or on the eviction list */
-	struct list_head exec_list;
-
-	/**
-	 * Used for performing relocations during execbuffer insertion.
-	 */
-	struct hlist_node exec_node;
-	unsigned long exec_handle;
-	struct drm_i915_gem_exec_object2 *exec_entry;
-
-};
-
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 212f6d8..0ea40b3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -57,6 +57,11 @@
 #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 #define HSW_WT_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0x6)
 
+static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
+				enum i915_cache_level cache_level,
+				u32 flags);
+static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
+
 static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
 				     enum i915_cache_level level)
 {
@@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->base.unbind_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 				   cache_level);
 }
 
+static void __always_unused
+gen6_ppgtt_bind_vma(struct i915_vma *vma,
+		    enum i915_cache_level cache_level,
+		    u32 flags)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	WARN_ON(flags);
+
+	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
@@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 				obj->base.size >> PAGE_SHIFT);
 }
 
+static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ppgtt_clear_range(vma->vm, entry,
+			       vma->obj->base.size >> PAGE_SHIFT);
+}
+
 extern int intel_iommu_gfx_mapped;
 /* Certain Gen5 chipsets require idling the GPU before
  * unmapping anything from the GTT when VT-d is enabled.
@@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 
 }
 
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 unused)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	vma->obj->has_global_gtt_mapping = 1;
+}
+
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
@@ -599,6 +639,53 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned int first = vma->node.start >> PAGE_SHIFT;
+	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	vma->obj->has_global_gtt_mapping = 0;
+	intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 flags)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	/* If there is no aliasing PPGTT, or the caller needs a global mapping,
+	 * or we have a global mapping already but the cacheability flags have
+	 * changed, set the global PTEs.
+	 *
+	 * If there is an aliasing PPGTT it is anecdotally faster, so use that
+	 * instead if none of the above hold true.
+	 *
+	 * NB: A global mapping should only be needed for special regions like
+	 * "gtt mappable", SNB errata, or if specified via special execbuf
+	 * flags. At all other times, the GPU will use the aliasing PPGTT.
+	 */
+	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
+		if (!obj->has_global_gtt_mapping ||
+		    (cache_level != obj->cache_level)) {
+			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
+						 cache_level);
+			obj->has_global_gtt_mapping = 1;
+		}
+	}
+
+	if (dev_priv->mm.aliasing_ppgtt &&
+	    (!obj->has_aliasing_ppgtt_mapping ||
+	     (cache_level != obj->cache_level))) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+	}
+}
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 			      enum i915_cache_level cache_level)
@@ -627,6 +714,27 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 	obj->has_global_gtt_mapping = 0;
 }
 
+static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	if (obj->has_global_gtt_mapping) {
+		gen6_ggtt_clear_range(vma->vm, entry,
+				      vma->obj->base.size >> PAGE_SHIFT);
+		obj->has_global_gtt_mapping = 0;
+	}
+
+	if (obj->has_aliasing_ppgtt_mapping) {
+		gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+				       entry,
+				       obj->base.size >> PAGE_SHIFT);
+		obj->has_aliasing_ppgtt_mapping = 0;
+	}
+}
+
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -860,7 +968,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
 		DRM_ERROR("Scratch setup failed\n");
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
 
 	return ret;
 }
@@ -892,7 +1002,9 @@ static int i915_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
 
 	return 0;
 }
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-16  7:37   ` Chris Wilson
  2013-09-16 18:31     ` Ben Widawsky
@ 2013-09-17 17:01     ` Ben Widawsky
  2013-09-17 20:55       ` Chris Wilson
  1 sibling, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-17 17:01 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
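
As a rough sketch of the explicit global binding described above
(illustrative only; the lookup helper comes from earlier in this
series):

	struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);

	/* Bind through the global GTT's vm; GLOBAL_BIND guarantees the
	 * global PTEs are written rather than just the aliasing ones. */
	vma->vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);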

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  9 ------
 drivers/gpu/drm/i915/i915_gem.c            | 31 ++++++++-----------
 drivers/gpu/drm/i915/i915_gem_context.c    |  8 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 31 +++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------------
 5 files changed, 36 insertions(+), 91 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 686a66c..ab88b43 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2085,17 +2085,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 651b91c..4e6f20a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2693,12 +2693,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3424,7 +3420,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3463,11 +3458,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3787,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3823,20 +3816,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bound an object
+	 * for the first time with an aliasing PPGTT present (and didn't
+	 * request GLOBAL), we'll need to do this on the second bind. */
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cb3b7e8..a030739 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -415,8 +416,11 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
+							   &dev_priv->gtt.base);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b26d979..b85e2dc 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+				 GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -1117,8 +1109,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE &&
+	    !batch_obj->has_global_gtt_mapping) {
+		const struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
+		BUG_ON(!vma);
+		ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
+	}
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ea40b3..7e4a308 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-17 17:01     ` [PATCH 5/6] [v3] " Ben Widawsky
@ 2013-09-17 20:55       ` Chris Wilson
  2013-09-17 23:14         ` Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-17 20:55 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Tue, Sep 17, 2013 at 10:01:33AM -0700, Ben Widawsky wrote:
> @@ -1117,8 +1109,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
>  	 * hsw should have this fixed, but let's be paranoid and do it
>  	 * unconditionally for now. */
> -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +	if (flags & I915_DISPATCH_SECURE &&
> +	    !batch_obj->has_global_gtt_mapping) {
> +		const struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> +		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> +		BUG_ON(!vma);
> +		ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> +	}

The issue here is that if we don't set the USE_PPGTT/USE_SECURE flag in
the dispatch, the CS will use the GGTT (hence our binding) but we
then need to use the GGTT offset for the dispatch as well.

Is that as concisely as we can write bind_to_ggtt? :(
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-17 20:55       ` Chris Wilson
@ 2013-09-17 23:14         ` Ben Widawsky
  2013-09-17 23:33           ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-17 23:14 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Tue, Sep 17, 2013 at 09:55:35PM +0100, Chris Wilson wrote:
> On Tue, Sep 17, 2013 at 10:01:33AM -0700, Ben Widawsky wrote:
> > @@ -1117,8 +1109,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> >  	 * hsw should have this fixed, but let's be paranoid and do it
> >  	 * unconditionally for now. */
> > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > +	if (flags & I915_DISPATCH_SECURE &&
> > +	    !batch_obj->has_global_gtt_mapping) {
> > +		const struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > +		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> > +		BUG_ON(!vma);
> > +		ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > +	}
> 
> The issue here is that if we don't set the USE_PPGTT/USE_SECURE flag in
> the dispatch, the CS will use the GGTT (hence our binding) but we
> then need to use the GGTT offset for the dispatch as well.
> 
> Is that as concisely as we can write bind_to_ggtt? :(
> -Chris
> 

Resuming the conversation started on irc... what do you want from me?

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-17 23:14         ` Ben Widawsky
@ 2013-09-17 23:33           ` Chris Wilson
  2013-09-17 23:48             ` Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-17 23:33 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Tue, Sep 17, 2013 at 04:14:43PM -0700, Ben Widawsky wrote:
> On Tue, Sep 17, 2013 at 09:55:35PM +0100, Chris Wilson wrote:
> > On Tue, Sep 17, 2013 at 10:01:33AM -0700, Ben Widawsky wrote:
> > > @@ -1117,8 +1109,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> > >  	 * hsw should have this fixed, but let's be paranoid and do it
> > >  	 * unconditionally for now. */
> > > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > > +	if (flags & I915_DISPATCH_SECURE &&
> > > +	    !batch_obj->has_global_gtt_mapping) {
> > > +		const struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > > +		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> > > +		BUG_ON(!vma);
> > > +		ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > > +	}
> > 
> > The issue here is that if we don't set the USE_PPGTT/USE_SECURE flag in
> > the dispatch, the CS will use the GGTT (hence our binding) but we
> > then need to use the GGTT offset for the dispatch as well.
> > 
> > Is that as concisely as we can write bind_to_ggtt? :(
> > -Chris
> > 
> 
> Resuming the conversation started on irc... what do you want from me?

I think we need to pass the ggtt offset to dispatch for
I915_DISPATCH_SECURE -- which offset to use might even depend upon the
implementation and hw generation in intel_ringbuffer.c. But at the very
least, I think SNB/IVB will be executing the wrong address come full
ppgtt.

dev_priv->ggtt.base.bind_vma(i915_gem_obj_to_ggtt(batch_obj),
                             batch_obj->cache_level,
			     GLOBAL_BIND);

#define i915_vm_bind(vm__, vma__, cache_level__, flags__) \
 (vm__)->bind_vma((vma__), (cache_level__), (flags__))

i915_vm_bind(&dev_priv->ggtt.base,
             i915_gem_obj_to_ggtt(batch_obj),
	     batch_obj->cache_level,
	     GLOBAL_BIND);
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-17 23:33           ` Chris Wilson
@ 2013-09-17 23:48             ` Ben Widawsky
  2013-09-17 23:57               ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-17 23:48 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Wed, Sep 18, 2013 at 12:33:32AM +0100, Chris Wilson wrote:
> On Tue, Sep 17, 2013 at 04:14:43PM -0700, Ben Widawsky wrote:
> > On Tue, Sep 17, 2013 at 09:55:35PM +0100, Chris Wilson wrote:
> > > On Tue, Sep 17, 2013 at 10:01:33AM -0700, Ben Widawsky wrote:
> > > > @@ -1117,8 +1109,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> > > >  	 * hsw should have this fixed, but let's be paranoid and do it
> > > >  	 * unconditionally for now. */
> > > > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > > > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > > > +	if (flags & I915_DISPATCH_SECURE &&
> > > > +	    !batch_obj->has_global_gtt_mapping) {
> > > > +		const struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > > > +		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> > > > +		BUG_ON(!vma);
> > > > +		ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > > > +	}
> > > 
> > > The issue here is that if we don't set the USE_PPGTT/USE_SECURE flag in
> > > the dispatch, the CS will use the GGTT (hence our binding) but we
> > > then need to use the GGTT offset for the dispatch as well.
> > > 
> > > Is that as concisely as we can write bind_to_ggtt? :(
> > > -Chris
> > > 
> > 
> > Resuming the conversation started on irc... what do you want from me?
> 
> I think we need to pass the ggtt offset to dispatch for
> I915_DISPATCH_SECURE -- which offset to use might even depend upon the
> implementation and hw generation in intel_ringbuffer.c. But at the very
> least, I think SNB/IVB will be executing the wrong address come full
> ppgtt.
> 
> dev_priv->ggtt.base.bind_vma(i915_gem_obj_to_ggtt(batch_obj),
>                              batch_obj->cache_level,
> 			     GLOBAL_BIND);
> 
> #define i915_vm_bind(vm__, vma__, cache_level__, flags__) \
>  (vm__)->bind_vma((vma__), (cache_level__), (flags__))
> 
> i915_vm_bind(&dev_priv->ggtt.base,
>              i915_gem_obj_to_ggtt(batch_obj),
> 	     batch_obj->cache_level,
> 	     GLOBAL_BIND);
> -Chris
> 

I915_DISPATCH_SECURE is a special case. If we see the flag, we look up
the GGTT offset as opposed to the offset in the VM being used at
execbuf. We can either bind the batchbuffer into both the PPGTT and
GGTT, or only ever into the GGTT - in either case, we'll have to have
done the bind (and found space in the drm_mm). It just seems like this
is duplicating the already existing i915_gem_object_bind_to_vm code
that's in place.
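
In code, I picture the dispatch lookup as something like this (sketch
only, not from the series):

	/* On the generations in question the CS reads a secure batch
	 * through the global GTT, so use the GGTT offset; otherwise use
	 * the offset in whichever vm this execbuf runs in. */
	if (flags & I915_DISPATCH_SECURE)
		exec_start = i915_gem_obj_ggtt_offset(batch_obj);
	else
		exec_start = i915_gem_obj_offset(batch_obj, vm);
	exec_start += args->batch_start_offset;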

Sorry if I am not following what you're asking. I'm just failing to see
a problem, or maybe you're just trying to solve problems that I haven't
yet conceived of, or solved in a different way. It's pretty darn hard to
discuss this given the piecemeal nature of the thing.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-17 23:48             ` Ben Widawsky
@ 2013-09-17 23:57               ` Chris Wilson
  2013-09-18  0:02                 ` Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-17 23:57 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Tue, Sep 17, 2013 at 04:48:50PM -0700, Ben Widawsky wrote:
> On Wed, Sep 18, 2013 at 12:33:32AM +0100, Chris Wilson wrote:
> > On Tue, Sep 17, 2013 at 04:14:43PM -0700, Ben Widawsky wrote:
> > > On Tue, Sep 17, 2013 at 09:55:35PM +0100, Chris Wilson wrote:
> > > > On Tue, Sep 17, 2013 at 10:01:33AM -0700, Ben Widawsky wrote:
> > > > > @@ -1117,8 +1109,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > > >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> > > > >  	 * hsw should have this fixed, but let's be paranoid and do it
> > > > >  	 * unconditionally for now. */
> > > > > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > > > > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > > > > +	if (flags & I915_DISPATCH_SECURE &&
> > > > > +	    !batch_obj->has_global_gtt_mapping) {
> > > > > +		const struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > > > > +		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> > > > > +		BUG_ON(!vma);
> > > > > +		ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > > > > +	}
> > > > 
> > > > The issue here is that if we don't set the USE_PPGTT/USE_SECURE flag in
> > > > the dispatch, the CS will use the GGTT (hence our binding) but we
> > > > then need to use the GGTT offset for the dispatch as well.
> > > > 
> > > > Is that as concisely as we can write bind_to_ggtt? :(
> > > > -Chris
> > > > 
> > > 
> > > Resuming the conversation started on irc... what do you want from me?
> > 
> > I think we need to pass the ggtt offset to dispatch for
> > I915_DISPATCH_SECURE -- which offset to use might even depend upon the
> > implementation and hw generation in intel_ringbuffer.c. But at the very
> > least, I think SNB/IVB will be executing the wrong address come full
> > ppgtt.
> > 
> > dev_priv->ggtt.base.bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> >                              batch_obj->cache_level,
> > 			     GLOBAL_BIND);
> > 
> > #define i915_vm_bind(vm__, vma__, cache_level__, flags__) \
> >  (vm__)->bind_vma((vma__), (cache_level__), (flags__))
> > 
> > i915_vm_bind(&dev_priv->ggtt.base,
> >              i915_gem_obj_to_ggtt(batch_obj),
> > 	     batch_obj->cache_level,
> > 	     GLOBAL_BIND);
> > -Chris
> > 
> 
> I915_DISPATCH_SECURE is a special case. If we see the flag, we look up
> the GGTT offset as opposed to the offset in the VM being used at
> execbuf. We can either bind the batchbuffer into both the PPGTT, and
> GGTT, or it's only even in the GGTT - in either case, we'll have to have
> done the bind (and found space in the drm_mm). It just seems like this
> is duplicating the already existing i915_gem_object_bind_to_vm code
> that's in place.
> 
> Sorry if I am not following what you're asking. I'm just failing to see
> a problem, or maybe you're just trying to solve problems that I haven't
> yet conceived; or solved in a different way.  It's pretty darn hard to
> discuss this given the piecemeal nature of the thing.

The code does

	exec_start = i915_gem_obj_offset(batch_obj, vm) +
			args->batch_start_offset;
	exec_len = args->batch_len;
	...
	ret = ring->dispatch_execbuffer(ring,
					exec_start, exec_len,
					flags);
	if (ret)
		goto err;

So we lookup the address of the batch buffer in the wrong vm for
I915_DISPATCH_SECURE.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-17 23:57               ` Chris Wilson
@ 2013-09-18  0:02                 ` Ben Widawsky
  2013-09-18  8:30                   ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-18  0:02 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Wed, Sep 18, 2013 at 12:57:20AM +0100, Chris Wilson wrote:
> On Tue, Sep 17, 2013 at 04:48:50PM -0700, Ben Widawsky wrote:
> > On Wed, Sep 18, 2013 at 12:33:32AM +0100, Chris Wilson wrote:
> > > On Tue, Sep 17, 2013 at 04:14:43PM -0700, Ben Widawsky wrote:
> > > > On Tue, Sep 17, 2013 at 09:55:35PM +0100, Chris Wilson wrote:
> > > > > On Tue, Sep 17, 2013 at 10:01:33AM -0700, Ben Widawsky wrote:
> > > > > > @@ -1117,8 +1109,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > > > >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> > > > > >  	 * hsw should have this fixed, but let's be paranoid and do it
> > > > > >  	 * unconditionally for now. */
> > > > > > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > > > > > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > > > > > +	if (flags & I915_DISPATCH_SECURE &&
> > > > > > +	    !batch_obj->has_global_gtt_mapping) {
> > > > > > +		const struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > > > > > +		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> > > > > > +		BUG_ON(!vma);
> > > > > > +		ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > > > > > +	}
> > > > > 
> > > > > The issue here is that if we don't set the USE_PPGTT/USE_SECURE flag in
> > > > > the dispatch, the CS will use the GGTT (hence our binding) but we
> > > > > then need to use the GGTT offset for the dispatch as well.
> > > > > 
> > > > > Is that as concisely as we can write bind_to_ggtt? :(
> > > > > -Chris
> > > > > 
> > > > 
> > > > Resuming the conversation started on irc... what do you want from me?
> > > 
> > > I think we need to pass the ggtt offset to dispatch for
> > > I915_DISPATCH_SECURE -- which offset to use might even depend upon the
> > > implementation and hw generation in intel_ringbuffer.c. But at the very
> > > least, I think SNB/IVB will be executing the wrong address come full
> > > ppgtt.
> > > 
> > > dev_priv->ggtt.base.bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> > >                              batch_obj->cache_level,
> > > 			     GLOBAL_BIND);
> > > 
> > > #define i915_vm_bind(vm__, vma__, cache_level__, flags__) \
> > >  (vm__)->bind_vma((vma__), (cache_level__), (flags__))
> > > 
> > > i915_vm_bind(&dev_priv->ggtt.base,
> > >              i915_gem_obj_to_ggtt(batch_obj),
> > > 	     batch_obj->cache_level,
> > > 	     GLOBAL_BIND);
> > > -Chris
> > > 
> > 
> > I915_DISPATCH_SECURE is a special case. If we see the flag, we look up
> > the GGTT offset as opposed to the offset in the VM being used at
> > execbuf. We can either bind the batchbuffer into both the PPGTT and
> > GGTT, or only ever into the GGTT - in either case, we'll have to have
> > done the bind (and found space in the drm_mm). It just seems like this
> > is duplicating the already existing i915_gem_object_bind_to_vm code
> > that's in place.
> > 
> > Sorry if I am not following what you're asking. I'm just failing to see
> > a problem, or maybe you're just trying to solve problems that I haven't
> > yet conceived of, or solved in a different way. It's pretty darn hard to
> > discuss this given the piecemeal nature of the thing.
> 
> The code does
> 
> 	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> 			args->batch_start_offset;
> 	exec_len = args->batch_len;
> 	...
> 	ret = ring->dispatch_execbuffer(ring,
> 					exec_start, exec_len,
> 					flags);
> 	if (ret)
> 		goto err;
> 
> So we lookup the address of the batch buffer in the wrong vm for
> I915_DISPATCH_SECURE.
> -Chris
> 

But this is very easily solved, no?

http://cgit.freedesktop.org/~bwidawsk/drm-intel/tree/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=ppgtt#n1083

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18  0:02                 ` Ben Widawsky
@ 2013-09-18  8:30                   ` Chris Wilson
  2013-09-18 14:47                     ` Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-18  8:30 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Tue, Sep 17, 2013 at 05:02:03PM -0700, Ben Widawsky wrote:
> > The code does
> > 
> > 	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > 			args->batch_start_offset;
> > 	exec_len = args->batch_len;
> > 	...
> > 	ret = ring->dispatch_execbuffer(ring,
> > 					exec_start, exec_len,
> > 					flags);
> > 	if (ret)
> > 		goto err;
> > 
> > So we lookup the address of the batch buffer in the wrong vm for
> > I915_DISPATCH_SECURE.
> > -Chris
> > 
> 
> But this is very easily solved, no?
> 
> http://cgit.freedesktop.org/~bwidawsk/drm-intel/tree/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=ppgtt#n1083

No, just because the batch once had a ggtt entry doesn't mean the CS is
going to use the ggtt for this execution...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18  8:30                   ` Chris Wilson
@ 2013-09-18 14:47                     ` Ben Widawsky
  2013-09-18 14:53                       ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-18 14:47 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Wed, Sep 18, 2013 at 09:30:17AM +0100, Chris Wilson wrote:
> On Tue, Sep 17, 2013 at 05:02:03PM -0700, Ben Widawsky wrote:
> > > The code does
> > > 
> > > 	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > 			args->batch_start_offset;
> > > 	exec_len = args->batch_len;
> > > 	...
> > > 	ret = ring->dispatch_execbuffer(ring,
> > > 					exec_start, exec_len,
> > > 					flags);
> > > 	if (ret)
> > > 		goto err;
> > > 
> > > So we lookup the address of the batch buffer in the wrong vm for
> > > I915_DISPATCH_SECURE.
> > > -Chris
> > > 
> > 
> > But this is very easily solved, no?
> > 
> > http://cgit.freedesktop.org/~bwidawsk/drm-intel/tree/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=ppgtt#n1083
> 
> No, just because the batch once had a ggtt entry doesn't mean the CS is
> going to use the ggtt for this execution...
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

I guess your use of the term, "CS" here is a bit confusing to me, since
in my head the CS better do whatever we tell it to do.

If you're trying to say, "just because batch_obj has a ggtt binding;
that doesn't necessarily mean it's the one to use at dispatch." I think
that statement is true, but it's still pretty simple to solve, just use
the I915_DISPATCH_SECURE flag to check instead of
obj->has_global_gtt_mapping. Right?

I'm really sorry about being so dense here.

As a side note, I tried really hard to think of how we could end up with
a ggtt mapping for batch_obj, and not want to use that one. I'm not
actually sure it's possible, but I can't prove it as such, so I'm
willing to assume it is possible. Excluding SNB, so few objects actually
will get a ggtt mapping, I don't believe any of them should be reused
for a batch BO - however, IGT can probably make it happen.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18 14:47                     ` Ben Widawsky
@ 2013-09-18 14:53                       ` Chris Wilson
  2013-09-18 15:48                         ` Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-18 14:53 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Wed, Sep 18, 2013 at 07:47:45AM -0700, Ben Widawsky wrote:
> On Wed, Sep 18, 2013 at 09:30:17AM +0100, Chris Wilson wrote:
> > On Tue, Sep 17, 2013 at 05:02:03PM -0700, Ben Widawsky wrote:
> > > > The code does
> > > > 
> > > > 	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > > 			args->batch_start_offset;
> > > > 	exec_len = args->batch_len;
> > > > 	...
> > > > 	ret = ring->dispatch_execbuffer(ring,
> > > > 					exec_start, exec_len,
> > > > 					flags);
> > > > 	if (ret)
> > > > 		goto err;
> > > > 
> > > > So we lookup the address of the batch buffer in the wrong vm for
> > > > I915_DISPATCH_SECURE.
> > > > -Chris
> > > > 
> > > 
> > > But this is very easily solved, no?
> > > 
> > > http://cgit.freedesktop.org/~bwidawsk/drm-intel/tree/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=ppgtt#n1083
> > 
> > No, just because the batch once had a ggtt entry doesn't mean the CS is
> > going to use the ggtt for this execution...
> > -Chris
> > 
> > -- 
> > Chris Wilson, Intel Open Source Technology Centre
> 
> I guess your use of the term, "CS" here is a bit confusing to me, since
> in my head the CS better do whatever we tell it to do.

Exactly, we tell the CS to use the ggtt for the SECURE batch (at least
on some generations).
 
> If you're trying to say, "just because batch_obj has a ggtt binding;
> that doesn't necessarily mean it's the one to use at dispatch." I think
> that statement is true, but it's still pretty simple to solve, just use
> the I915_DISPATCH_SECURE flag to check instead of
> obj->has_global_gtt_mapping. Right?

Yes. With the same caveat that it may change.
 
> I'm really sorry about being so dense here.
> 
> As a side note, I tried really hard to think of how we could end up with
> a ggtt mapping for batch_obj, and not want to use that one. I'm not
> actually sure it's possible, but I can't prove it as such, so I'm
> willing to assume it is possible. Excluding SNB, so few objects actually
> will get a ggtt mapping, I don't believe any of them should be reused
> for a batch BO - however, IGT can probably make it happen.

It's trivial for a batch to end up with a ggtt entry - userspace can
just access it through the GTT. Or any bo prior to reusing it as a
batch.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18 14:53                       ` Chris Wilson
@ 2013-09-18 15:48                         ` Ben Widawsky
  2013-09-18 15:59                           ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-18 15:48 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Wed, Sep 18, 2013 at 03:53:37PM +0100, Chris Wilson wrote:
> On Wed, Sep 18, 2013 at 07:47:45AM -0700, Ben Widawsky wrote:
> > On Wed, Sep 18, 2013 at 09:30:17AM +0100, Chris Wilson wrote:
> > > On Tue, Sep 17, 2013 at 05:02:03PM -0700, Ben Widawsky wrote:
> > > > > The code does
> > > > > 
> > > > > 	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > > > 			args->batch_start_offset;
> > > > > 	exec_len = args->batch_len;
> > > > > 	...
> > > > > 	ret = ring->dispatch_execbuffer(ring,
> > > > > 					exec_start, exec_len,
> > > > > 					flags);
> > > > > 	if (ret)
> > > > > 		goto err;
> > > > > 
> > > > > So we lookup the address of the batch buffer in the wrong vm for
> > > > > I915_DISPATCH_SECURE.
> > > > > -Chris
> > > > > 
> > > > 
> > > > But this is very easily solved, no?
> > > > 
> > > > http://cgit.freedesktop.org/~bwidawsk/drm-intel/tree/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=ppgtt#n1083
> > > 
> > > No, just because the batch once had a ggtt entry doesn't mean the CS is
> > > going to use the ggtt for this execution...
> > > -Chris
> > > 
> > > -- 
> > > Chris Wilson, Intel Open Source Technology Centre
> > 
> > I guess your use of the term, "CS" here is a bit confusing to me, since
> > in my head the CS better do whatever we tell it to do.
> 
> Exactly, we tell the CS to use the ggtt for the SECURE batch (at least
> on some generations).
>  
> > If you're trying to say, "just because batch_obj has a ggtt binding;
> > that doesn't necessarily mean it's the one to use at dispatch." I think
> > that statement is true, but it's still pretty simple to solve, just use
> > the I915_DISPATCH_SECURE flag to check instead of
> > obj->has_global_gtt_mapping. Right?
> 
> Yes. With the same caveat that it may change.

What may change? Dispatch secure always means use the GGTT offset, does
it not? Or do you think we'll want privileged batches running from the
PPGTT? If the latter is true, why ever use GGTT?

>  
> > I'm really sorry about being so dense here.
> > 
> > As a side note, I tried really hard to think of how we could end up with
> > a ggtt mapping for batch_obj, and not want to use that one. I'm not
> > actually sure it's possible, but I can't prove it as such, so I'm
> > willing to assume it is possible. Excluding SNB, so few objects actually
> > will get a ggtt mapping, I don't believe any of them should be reused
> > for a batch BO - however, IGT can probably make it happen.
> 
> It's trivial for a batch to end up with a ggtt entry - userspace can
> just access it through the GTT. Or any bo prior to reusing it as a
> batch.
> -Chris

Trivial, perhaps on the gtt mapping. It's not really relevant, but is
any userspace currently doing that? As for BO reuse, that's a separate
problem - are we handing back BOs with their mappings intact? That seems
like a security problem.

> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18 15:48                         ` Ben Widawsky
@ 2013-09-18 15:59                           ` Chris Wilson
  2013-09-18 16:11                             ` Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-18 15:59 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Wed, Sep 18, 2013 at 08:48:45AM -0700, Ben Widawsky wrote:
> On Wed, Sep 18, 2013 at 03:53:37PM +0100, Chris Wilson wrote:
> > On Wed, Sep 18, 2013 at 07:47:45AM -0700, Ben Widawsky wrote:
> > > On Wed, Sep 18, 2013 at 09:30:17AM +0100, Chris Wilson wrote:
> > > > On Tue, Sep 17, 2013 at 05:02:03PM -0700, Ben Widawsky wrote:
> > > > > > The code does
> > > > > > 
> > > > > > 	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > > > > 			args->batch_start_offset;
> > > > > > 	exec_len = args->batch_len;
> > > > > > 	...
> > > > > > 	ret = ring->dispatch_execbuffer(ring,
> > > > > > 					exec_start, exec_len,
> > > > > > 					flags);
> > > > > > 	if (ret)
> > > > > > 		goto err;
> > > > > > 
> > > > > > So we lookup the address of the batch buffer in the wrong vm for
> > > > > > I915_DISPATCH_SECURE.
> > > > > > -Chris
> > > > > > 
> > > > > 
> > > > > But this is very easily solved, no?
> > > > > 
> > > > > http://cgit.freedesktop.org/~bwidawsk/drm-intel/tree/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=ppgtt#n1083
> > > > 
> > > > No, just because the batch once had a ggtt entry doesn't mean the CS is
> > > > going to use the ggtt for this execution...
> > > > -Chris
> > > > 
> > > > -- 
> > > > Chris Wilson, Intel Open Source Technology Centre
> > > 
> > > I guess your use of the term, "CS" here is a bit confusing to me, since
> > > in my head the CS better do whatever we tell it to do.
> > 
> > Exactly, we tell the CS to use the ggtt for the SECURE batch (at least
> > on some generations).
> >  
> > > If you're trying to say, "just because batch_obj has a ggtt binding;
> > > that doesn't necessarily mean it's the one to use at dispatch." I think
> > > that statement is true, but it's still pretty simple to solve, just use
> > > the I915_DISPATCH_SECURE flag to check instead of
> > > obj->has_global_gtt_mapping. Right?
> > 
> > Yes. With the same caveat that it may change.
> 
> What may change? Dispatch secure always means use the GGTT offset, does
> it not? Or do you think we'll want privileged batches running from the
> PPGTT? If the latter is true, why ever use GGTT?

The security bit is already independent from the use-ppgtt bit. With
Haswell it should be possible to execute a privileged batch buffer from
a ppgtt address, right? In which case we would not need to allocate a
GGTT entry.
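
The render ring already has separate bits for this; the hsw dispatch
vfunc is roughly (paraphrasing from intel_ringbuffer.c, untested here):

	ret = intel_ring_begin(ring, 2);
	if (ret)
		return ret;

	intel_ring_emit(ring,
			MI_BATCH_BUFFER_START | MI_BATCH_PPGTT_HSW |
			(flags & I915_DISPATCH_SECURE ?
			 0 : MI_BATCH_NON_SECURE_HSW));
	intel_ring_emit(ring, offset);
	intel_ring_advance(ring);

i.e. a privileged batch keeps MI_BATCH_PPGTT_HSW set and simply drops
MI_BATCH_NON_SECURE_HSW, so no ggtt offset is strictly required.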
 
> > > I'm really sorry about being so dense here.
> > > 
> > > As a side note, I tried really hard to think of how we could end up with
> > > a ggtt mapping for batch_obj, and not want to use that one. I'm not
> > > actually sure it's possible, but I can't prove it as such, so I'm
> > > willing to assume it is possible. Excluding SNB, so few objects actually
> > > will get a ggtt mapping, I don't believe any of them should be reused
> > > for a batch BO - however, IGT can probably make it happen.
> > 
> > It's trivial for a batch to end up with a ggtt entry - userspace can
> > just access it through the GTT. Or any bo prior to reusing it as a
> > batch.
> > -Chris
> 
> Trivial, perhaps on the gtt mapping. It's not really relevant, but is
> any userspace currently doing that? As for BO reuse, that's a separate
> problem - are we handing back BOs with their mappings intact? That seems
> like a security problem.

*Userspace* caches its bo with the mappings intact.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18 15:59                           ` Chris Wilson
@ 2013-09-18 16:11                             ` Ben Widawsky
  2013-09-18 16:15                               ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-18 16:11 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Wed, Sep 18, 2013 at 04:59:01PM +0100, Chris Wilson wrote:
> On Wed, Sep 18, 2013 at 08:48:45AM -0700, Ben Widawsky wrote:
> > On Wed, Sep 18, 2013 at 03:53:37PM +0100, Chris Wilson wrote:
> > > On Wed, Sep 18, 2013 at 07:47:45AM -0700, Ben Widawsky wrote:
> > > > On Wed, Sep 18, 2013 at 09:30:17AM +0100, Chris Wilson wrote:
> > > > > On Tue, Sep 17, 2013 at 05:02:03PM -0700, Ben Widawsky wrote:
> > > > > > > The code does
> > > > > > > 
> > > > > > > 	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > > > > > 			args->batch_start_offset;
> > > > > > > 	exec_len = args->batch_len;
> > > > > > > 	...
> > > > > > > 	ret = ring->dispatch_execbuffer(ring,
> > > > > > > 					exec_start, exec_len,
> > > > > > > 					flags);
> > > > > > > 	if (ret)
> > > > > > > 		goto err;
> > > > > > > 
> > > > > > > So we lookup the address of the batch buffer in the wrong vm for
> > > > > > > I915_DISPATCH_SECURE.
> > > > > > > -Chris
> > > > > > > 
> > > > > > 
> > > > > > But this is very easily solved, no?
> > > > > > 
> > > > > > http://cgit.freedesktop.org/~bwidawsk/drm-intel/tree/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=ppgtt#n1083
> > > > > 
> > > > > No, just because the batch once had a ggtt entry doesn't mean the CS is
> > > > > going to use the ggtt for this execution...
> > > > > -Chris
> > > > > 
> > > > > -- 
> > > > > Chris Wilson, Intel Open Source Technology Centre
> > > > 
> > > > I guess your use of the term, "CS" here is a bit confusing to me, since
> > > > in my head the CS better do whatever we tell it to do.
> > > 
> > > Exactly, we tell the CS to use the ggtt for the SECURE batch (at least
> > > on some generations).
> > >  
> > > > If you're trying to say, "just because batch_obj has a ggtt binding;
> > > > that doesn't necessarily mean it's the one to use at dispatch." I think
> > > > that statement is true, but it's still pretty simple to solve, just use
> > > > the I915_DISPATCH_SECURE flag to check instead of
> > > > obj->has_global_gtt_mapping. Right?
> > > 
> > > Yes. With the same caveat that it may change.
> > 
> > What may change? Dispatch secure always means use the GGTT offset, does
> > it not? Or do you think we'll want privileged batches running from the
> > PPGTT? If the latter is true, why ever use GGTT?
> 
> The security bit is already independent from the use-ppgtt bit. With
> Haswell it should be possible to execute a privileged batch buffer from
> a ppgtt address, right? In which case we would not need to allocate a
> GGTT entry.
>  

Right, that was my point. But I *still* fail to see how your earlier
request does anything to help this along. The decision can still easily
be made at any given time with the I915_DISPATCH_SECURE flag, and per
platform driver policy. Say, if you wanted HSW to always run privileged
batches out of PPGTT. OTOH, if we need to pass a flag down to specify
which address space to execute the batch out of, maybe some more hoops
need to be jumped through. I don't see a reason to do this, however, and
if we want to support IVB, we have to support GGTT execution anyway - so
I'm not really sure there's a benefit to building in support for PPGTT
privileged execution.
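
To make the point concrete, the check I have in mind is tiny; an
untested sketch (the helper name is made up):

	static bool batch_needs_ggtt(struct drm_device *dev, u32 flags)
	{
		if ((flags & I915_DISPATCH_SECURE) == 0)
			return false;
		/* Per-platform policy lives here: IVB and earlier must
		 * use the ggtt for privileged batches; HSW could later
		 * be flipped to ppgtt without touching any caller. */
		return !IS_HASWELL(dev);
	}

and exec_start then comes from i915_gem_obj_ggtt_offset(batch_obj) or
i915_gem_obj_offset(batch_obj, vm) accordingly.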


> > > > I'm really sorry about being so dense here.
> > > > 
> > > > As a side note, I tried really hard to think of how we could end up with
> > > > a ggtt mapping for batch_obj, and not want to use that one. I'm not
> > > > actually sure it's possible, but I can't prove it as such, so I'm
> > > > willing to assume it is possible. Excluding SNB, so few objects actually
> > > > will get a ggtt mapping, I don't believe any of them should be reused
> > > > for a batch BO - however, IGT can probably make it happen.
> > > 
> > > It's trivial for a batch to end up with a ggtt entry - userspace can
> > > just access it through the GTT. Or any bo prior to reusing it as a
> > > batch.
> > > -Chris
> > 
> > Trivial, perhaps on the gtt mapping. It's not really relevant, but is
> > any userspace currently doing that? As for BO reuse, that's a separate
> > problem - are we handing back BOs with their mappings intact? That seems
> > like a security problem.
> 
> *Userspace* caches its bo with the mappings intact.
> -Chris

Yes, this seems like a potential (albeit small) problem to me if we (the
kernel) arbitrarily upgrade BOs to a GGTT mapping. I guess everything
running privileged is trusted though, so we don't need to worry about
the unintentional global BOs being snooped. It does sort of seem to
circumvent real PPGTT to some extent though if the global mappings
linger.

Let me state that at this point of the thread, I am lost. Do you still
want the original change you asked for? I still don't understand, or see
a reason for not just quirking away with a quick check (which I'll state
again doesn't even matter until patches which haven't yet been written
are posted) - but I clearly haven't been able to convince you; and
nobody else is stepping in.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18 16:11                             ` Ben Widawsky
@ 2013-09-18 16:15                               ` Chris Wilson
  2013-09-18 16:20                                 ` Daniel Vetter
  2013-09-19  0:12                                 ` [PATCH] [v4] " Ben Widawsky
  0 siblings, 2 replies; 55+ messages in thread
From: Chris Wilson @ 2013-09-18 16:15 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Wed, Sep 18, 2013 at 09:11:56AM -0700, Ben Widawsky wrote:
> On Wed, Sep 18, 2013 at 04:59:01PM +0100, Chris Wilson wrote:
> > On Wed, Sep 18, 2013 at 08:48:45AM -0700, Ben Widawsky wrote:
> > > On Wed, Sep 18, 2013 at 03:53:37PM +0100, Chris Wilson wrote:
> > > > On Wed, Sep 18, 2013 at 07:47:45AM -0700, Ben Widawsky wrote:
> > > > > On Wed, Sep 18, 2013 at 09:30:17AM +0100, Chris Wilson wrote:
> > > > > > On Tue, Sep 17, 2013 at 05:02:03PM -0700, Ben Widawsky wrote:
> > > > > > > > The code does
> > > > > > > > 
> > > > > > > > 	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > > > > > > > 			args->batch_start_offset;
> > > > > > > > 	exec_len = args->batch_len;
> > > > > > > > 	...
> > > > > > > > 	ret = ring->dispatch_execbuffer(ring,
> > > > > > > > 					exec_start, exec_len,
> > > > > > > > 					flags);
> > > > > > > > 	if (ret)
> > > > > > > > 		goto err;
> > > > > > > > 
> > > > > > > > So we lookup the address of the batch buffer in the wrong vm for
> > > > > > > > I915_DISPATCH_SECURE.
> > > > > > > > -Chris
> > > > > > > > 
> > > > > > > 
> > > > > > > But this is very easily solved, no?
> > > > > > > 
> > > > > > > http://cgit.freedesktop.org/~bwidawsk/drm-intel/tree/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=ppgtt#n1083
> > > > > > 
> > > > > > No, just because the batch once had a ggtt entry doesn't mean the CS is
> > > > > > going to use the ggtt for this execution...
> > > > > > -Chris
> > > > > > 
> > > > > > -- 
> > > > > > Chris Wilson, Intel Open Source Technology Centre
> > > > > 
> > > > > I guess your use of the term, "CS" here is a bit confusing to me, since
> > > > > in my head the CS better do whatever we tell it to do.
> > > > 
> > > > Exactly, we tell the CS to use the ggtt for the SECURE batch (at least
> > > > on some generations).
> > > >  
> > > > > If you're trying to say, "just because batch_obj has a ggtt binding;
> > > > > that doesn't necessarily mean it's the one to use at dispatch." I think
> > > > > that statement is true, but it's still pretty simple to solve, just use
> > > > > the I915_DISPATCH_SECURE flag to check instead of
> > > > > obj->has_global_gtt_mapping. Right?
> > > > 
> > > > Yes. With the same caveat that it may change.
> > > 
> > > What may change? Dispatch secure always means use the GGTT offset, does
> > > it not? Or do you think we'll want privileged batches running from the
> > > PPGTT? If the latter is true, why ever use GGTT?
> > 
> > The security bit is already independent from the use-ppgtt bit. With
> > Haswell it should be possible to execute a privileged batch buffer from
> > a ppgtt address, right? In which case we would not need to allocate a
> > GGTT entry.
> >  
> 
> Right, that was my point. But I *still* fail to see how your earlier
> request does anything to help this along. The decision can still easily
> be made at any given time with the I915_DISPATCH_SECURE flag, and per
> platform driver policy. Say, if you wanted HSW to always run privileged
> batches out of PPGTT. OTOH, if we need to pass a flag down to specify
> which address space to execute the batch out of, maybe some more hoops
> need to be jumped through. I don't see a reason to do this however, and
> if we want to support IVB, we have to support GGTT execution anyway - so
> I'm not really sure of a benefit to building in support for PPGTT
> privileged execution.
> 
> 
> > > > > I'm really sorry about being so dense here.
> > > > > 
> > > > > As a side note, I tried really hard to think of how we could end up with
> > > > > a ggtt mapping for batch_obj, and not want to use that one. I'm not
> > > > > actually sure it's possible, but I can't prove it as such, so I'm
> > > > > willing to assume it is possible. Excluding SNB, so few objects actually
> > > > > will get a ggtt mapping, I don't believe any of them should be reused
> > > > > for a batch BO - however, IGT can probably make it happen.
> > > > 
> > > > It's trivial for a batch to end up with a ggtt entry - userspace can
> > > > just access it through the GTT. Or any bo prior to reusing it as a
> > > > batch.
> > > > -Chris
> > > 
> > > Trivial, perhaps on the gtt mapping. It's not really relevant, but is
> > > any userspace currently doing that? As for BO reuse, that's a separate
> > > problem - are we handing back BOs with their mappings intact? That seems
> > > like a security problem.
> > 
> > *Userspace* caches its bo with the mappings intact.
> > -Chris
> 
> Yes, this seems like a potential (albeit small) problem to me if we (the
> kernel) arbitrarily upgrade BOs to a GGTT mapping. I guess everything
> running privileged is trusted though, so we don't need to worry about
> the unintentional global BOs being snooped. It does sort of seem to
> circumvent real PPGTT to some extent though if the global mappings
> linger.
> 
> Let me state that at this point of the thread, I am lost. Do you still
> want the original change you asked for? I still don't understand, or see
> a reason for not just quirking away with a quick check (which I'll state
> again doesn't even matter until patches which haven't yet been written
> are posted) - but I clearly haven't been able to convince you; and
> nobody else is stepping in.

Yes, I want the bug in the code fixed.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18 16:15                               ` Chris Wilson
@ 2013-09-18 16:20                                 ` Daniel Vetter
  2013-09-18 16:37                                   ` Ben Widawsky
  2013-09-19  0:12                                 ` [PATCH] [v4] " Ben Widawsky
  1 sibling, 1 reply; 55+ messages in thread
From: Daniel Vetter @ 2013-09-18 16:20 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Ben Widawsky, Intel GFX

On Wed, Sep 18, 2013 at 6:15 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Yes, I want the bug in the code fixed.

I guess what Ben's trying to say is that right now we don't yet have a
bug (since we lack the ppgtt address space). But I agree that the fix
Ben pointed at in this thread of using obj->has_global_gtt_mapping won't
work, we need to pick the address space for the batch offset according
to the I915_DISPATCH_SECURE flag. And we also need to make sure that we
actually have the global mapping around.
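
I.e. something like this in i915_gem_do_execbuffer (untested sketch):

	if (flags & I915_DISPATCH_SECURE) {
		/* ensure the ggtt binding exists, then use its offset */
		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
	} else {
		exec_start += i915_gem_obj_offset(batch_obj, vm);
	}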

Aside: Batch security and ppgtt aren't even fully untangled on hsw,
afaik only on the render ring do we have separate bits.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 5/6] [v3] drm/i915: Use the new vm [un]bind functions
  2013-09-18 16:20                                 ` Daniel Vetter
@ 2013-09-18 16:37                                   ` Ben Widawsky
  0 siblings, 0 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-18 16:37 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Ben Widawsky

On Wed, Sep 18, 2013 at 06:20:23PM +0200, Daniel Vetter wrote:
> On Wed, Sep 18, 2013 at 6:15 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > Yes, I want the bug in the code fixed.
> 
> I guess what Ben's trying to say is that right now we don't yet have a
> bug (since we lack the ppgtt address space). But I agree that the fix
> Ben pointed at in this thread of using obj->has_global_mapping won't
> work, we need to pick the address space for the batch offset according
> to the SECURE_DISPATCH flag. And we also need to make sure that we
> actually have the global mapping around.
> 
> Aside: Batch security and ppgtt aren't even fully untangled on hsw,
> afaik only on the render ring do we have seperate bits.
> -Daniel

I see it now. He was trying to solve my trivial bug with the grander,
longer-term solution. As he requested, I'll just fix the bug for now,
and we can worry about multiple VM support later.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH] [v4] drm/i915: Use the new vm [un]bind functions
  2013-09-18 16:15                               ` Chris Wilson
  2013-09-18 16:20                                 ` Daniel Vetter
@ 2013-09-19  0:12                                 ` Ben Widawsky
  2013-09-19  9:13                                   ` Chris Wilson
  1 sibling, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-19  0:12 UTC (permalink / raw)
  To: Intel GFX, Chris Wilson; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

v4: Make the code support the secure dispatch flag, which requires
special handling during execbuf. This was fixed (incorrectly) later in
the series, but having it here earlier in the series should be perfectly
acceptable. (Chris)
Move do_switch over to the new, special ggtt_vma interface.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 15 ++++------
 drivers/gpu/drm/i915/i915_gem.c            | 39 +++++++++---------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  7 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 48 +++++++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------------
 5 files changed, 56 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 686a66c..8172eb1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1862,6 +1862,12 @@ int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
+int __must_check
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 int __must_check i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj);
@@ -2085,17 +2091,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 651b91c..4f8dc67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -43,12 +43,6 @@ static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *o
 static __must_check int
 i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 			       bool readonly);
-static __must_check int
-i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
-			   struct i915_address_space *vm,
-			   unsigned alignment,
-			   bool map_and_fenceable,
-			   bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -2693,12 +2687,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3155,7 +3145,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 /**
  * Finds free space in the GTT aperture and binds the object there.
  */
-static int
+int
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
@@ -3424,7 +3414,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3463,11 +3452,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3781,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3823,20 +3810,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bind an object
+	 * the first time, and we had aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind.*/
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cb3b7e8..b2989f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -415,8 +416,10 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b26d979..cfc9c9d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+				 GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -936,7 +928,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct intel_ring_buffer *ring;
 	struct i915_ctx_hang_stats *hs;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
-	u32 exec_start, exec_len;
+	u32 exec_len, exec_start = args->batch_start_offset;
 	u32 mask, flags;
 	int ret, mode, i;
 	bool need_relocs;
@@ -1117,8 +1109,26 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE) {
+		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
+		/* Assuming all privileged batches are in the global GTT means
+		 * we need to make sure we have a global gtt offset, as well as
+		 * the PTEs mapped. As mentioned above, we can forego this on
+		 * HSW, but don't.
+		 */
+		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
+						 false);
+		if (ret)
+			goto err;
+
+		if (!batch_obj->has_global_gtt_mapping) {
+			struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
+			BUG_ON(!vma);
+			ggtt->bind_vma(vma, batch_obj->cache_level,
+				       GLOBAL_BIND);
+		}
+	} else
+		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
@@ -1160,8 +1170,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_offset(batch_obj, vm) +
-		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ea40b3..7e4a308 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH] [v4] drm/i915: Use the new vm [un]bind functions
  2013-09-19  0:12                                 ` [PATCH] [v4] " Ben Widawsky
@ 2013-09-19  9:13                                   ` Chris Wilson
  2013-09-19 14:15                                     ` [PATCH] [v5] " Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-19  9:13 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Wed, Sep 18, 2013 at 05:12:37PM -0700, Ben Widawsky wrote:
> @@ -1117,8 +1109,26 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
>  	 * hsw should have this fixed, but let's be paranoid and do it
>  	 * unconditionally for now. */
> -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +	if (flags & I915_DISPATCH_SECURE) {
> +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> +		/* Assuming all privileged batches are in the global GTT means
> +		 * we need to make sure we have a global gtt offset, as well as
> +		 * the PTEs mapped. As mentioned above, we can forego this on
> +		 * HSW, but don't.
> +		 */
> +		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
> +						 false);
> +		if (ret)
> +			goto err;
> +
> +		if (!batch_obj->has_global_gtt_mapping) {
> +			struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
> +			BUG_ON(!vma);
> +			ggtt->bind_vma(vma, batch_obj->cache_level,
> +				       GLOBAL_BIND);
> +		}

Just lose the vma = ..., BUG_ON(vma == NULL); it makes the code less readable.
And I blame that loss of readability for the bug here.
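
I.e. just do the bind directly:

	ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
		       batch_obj->cache_level, GLOBAL_BIND);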

> +	} else
> +		exec_start += i915_gem_obj_offset(batch_obj, vm);
>  
>  	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
>  	if (ret)

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH] [v5] drm/i915: Use the new vm [un]bind functions
  2013-09-19  9:13                                   ` Chris Wilson
@ 2013-09-19 14:15                                     ` Ben Widawsky
  2013-09-19 14:26                                       ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-19 14:15 UTC (permalink / raw)
  To: Intel GFX, Chris Wilson; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

v4: Make the code support the secure dispatch flag, which requires
special handling during execbuf. This was fixed (incorrectly) later in
the series, but having it here earlier in the series should be perfectly
acceptable. (Chris)
Move do_switch over to the new, special ggtt_vma interface.

v5: Don't use a local variable (or assertion) when setting the batch
object to the global GTT during secure dispatch (Chris)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 15 ++++------
 drivers/gpu/drm/i915/i915_gem.c            | 39 +++++++++---------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  7 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 46 +++++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------------
 5 files changed, 54 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 686a66c..8172eb1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1862,6 +1862,12 @@ int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
+int __must_check
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 int __must_check i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj);
@@ -2085,17 +2091,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 651b91c..4f8dc67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -43,12 +43,6 @@ static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *o
 static __must_check int
 i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 			       bool readonly);
-static __must_check int
-i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
-			   struct i915_address_space *vm,
-			   unsigned alignment,
-			   bool map_and_fenceable,
-			   bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -2693,12 +2687,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3155,7 +3145,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 /**
  * Finds free space in the GTT aperture and binds the object there.
  */
-static int
+int
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
@@ -3424,7 +3414,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3463,11 +3452,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3781,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3823,20 +3810,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bind an object
+	 * the first time, and we had aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind.*/
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cb3b7e8..b2989f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -415,8 +416,10 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b26d979..c370a2b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+				 GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -936,7 +928,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct intel_ring_buffer *ring;
 	struct i915_ctx_hang_stats *hs;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
-	u32 exec_start, exec_len;
+	u32 exec_len, exec_start = args->batch_start_offset;
 	u32 mask, flags;
 	int ret, mode, i;
 	bool need_relocs;
@@ -1117,8 +1109,24 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE) {
+		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
+		/* Assuming all privileged batches are in the global GTT means
+		 * we need to make sure we have a global gtt offset, as well as
+		 * the PTEs mapped. As mentioned above, we can forego this on
+		 * HSW, but don't.
+		 */
+		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
+						 false);
+		if (ret)
+			goto err;
+
+		if (!batch_obj->has_global_gtt_mapping)
+			ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
+				       batch_obj->cache_level,
+				       GLOBAL_BIND);
+	} else
+		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
@@ -1160,8 +1168,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_offset(batch_obj, vm) +
-		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ea40b3..7e4a308 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH] [v5] drm/i915: Use the new vm [un]bind functions
  2013-09-19 14:15                                     ` [PATCH] [v5] " Ben Widawsky
@ 2013-09-19 14:26                                       ` Chris Wilson
  2013-09-19 14:41                                         ` [PATCH] [v6] " Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-19 14:26 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Thu, Sep 19, 2013 at 07:15:47AM -0700, Ben Widawsky wrote:
> @@ -1117,8 +1109,24 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
>  	 * hsw should have this fixed, but let's be paranoid and do it
>  	 * unconditionally for now. */
> -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +	if (flags & I915_DISPATCH_SECURE) {
> +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> +		/* Assuming all privileged batches are in the global GTT means
> +		 * we need to make sure we have a global gtt offset, as well as
> +		 * the PTEs mapped. As mentioned above, we can forego this on
> +		 * HSW, but don't.
> +		 */
> +		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
> +						 false);
> +		if (ret)
> +			goto err;
> +
> +		if (!batch_obj->has_global_gtt_mapping)
> +			ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> +				       batch_obj->cache_level,
> +				       GLOBAL_BIND);

+ exec_start += i915_gem_obj_ggtt_offset(batch_obj, vm);

> +	} else
> +		exec_start += i915_gem_obj_offset(batch_obj, vm);
>  
>  	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
>  	if (ret)

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH] [v6] drm/i915: Use the new vm [un]bind functions
  2013-09-19 14:26                                       ` Chris Wilson
@ 2013-09-19 14:41                                         ` Ben Widawsky
  2013-09-19 14:45                                           ` [PATCH] [v7] " Ben Widawsky
  2013-09-19 14:47                                           ` [PATCH] [v6] " Chris Wilson
  0 siblings, 2 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-19 14:41 UTC (permalink / raw)
  To: Intel GFX, Chris Wilson; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

v4: Make the code support the secure dispatch flag, which requires
special handling during execbuf. This was fixed (incorrectly) later in
the series, but having it here earlier in the series should be perfectly
acceptable. (Chris)
Move do_switch over to the new, special ggtt_vma interface.

v5: Don't use a local variable (or assertion) when setting the batch
object to the global GTT during secure dispatch (Chris)

v6: Calculate the exec offset for the secure case (bug fix missed on
v4). (Chris)
Remove redundant check for has_global_gtt_mapping, since it is done in
bind_vma.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 15 ++++------
 drivers/gpu/drm/i915/i915_gem.c            | 39 +++++++++---------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  7 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 47 ++++++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------------
 5 files changed, 55 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 686a66c..8172eb1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1862,6 +1862,12 @@ int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
+int __must_check
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 int __must_check i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj);
@@ -2085,17 +2091,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 651b91c..4f8dc67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -43,12 +43,6 @@ static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *o
 static __must_check int
 i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 			       bool readonly);
-static __must_check int
-i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
-			   struct i915_address_space *vm,
-			   unsigned alignment,
-			   bool map_and_fenceable,
-			   bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -2693,12 +2687,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3155,7 +3145,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 /**
  * Finds free space in the GTT aperture and binds the object there.
  */
-static int
+int
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
@@ -3424,7 +3414,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3463,11 +3452,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3781,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3823,20 +3810,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bind an object
+	 * the first time, and we had aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind.*/
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cb3b7e8..b2989f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -415,8 +416,10 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b26d979..e57837c 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+				 GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -936,7 +928,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct intel_ring_buffer *ring;
 	struct i915_ctx_hang_stats *hs;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
-	u32 exec_start, exec_len;
+	u32 exec_len, exec_start = args->batch_start_offset;
 	u32 mask, flags;
 	int ret, mode, i;
 	bool need_relocs;
@@ -1117,8 +1109,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE) {
+		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
+		/* Assuming all privileged batches are in the global GTT means
+		 * we need to make sure we have a global gtt offset, as well as
+		 * the PTEs mapped. As mentioned above, we can forego this on
+		 * HSW, but don't.
+		 */
+		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
+						 false);
+		if (ret)
+			goto err;
+
+		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
+			       batch_obj->cache_level,
+			       GLOBAL_BIND);
+
+		exec_start += i915_gem_obj_ggtt_offset(batch_obj, vm);
+	} else
+		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
@@ -1160,8 +1169,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_offset(batch_obj, vm) +
-		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ea40b3..7e4a308 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread
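
For readers following the series: the per-VM bind/unbind interface that
the diff above exercises reduces to roughly the sketch below. It is
assembled from the calls visible in the patches (vma->vm->bind_vma(),
vma->vm->unbind_vma(), the GLOBAL_BIND flag); the exact placement of
the hooks in struct i915_address_space and the flag's value are
assumptions, not quoted from the tree.

	/* Sketch only: the binding hooks this series routes all
	 * GGTT/PPGTT (un)mapping through. GLOBAL_BIND requests a
	 * global GTT mapping in addition to any aliasing PPGTT one.
	 */
	#define GLOBAL_BIND (1 << 0)	/* assumed value */

	struct i915_address_space {
		/* ... existing fields elided ... */
		void (*bind_vma)(struct i915_vma *vma,
				 enum i915_cache_level cache_level,
				 u32 flags);
		void (*unbind_vma)(struct i915_vma *vma);
	};

	/* Typical caller pattern, as seen throughout the diffs: */
	vma->vm->bind_vma(vma, vma->obj->cache_level, GLOBAL_BIND);
	vma->vm->unbind_vma(vma);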

* [PATCH] [v7] drm/i915: Use the new vm [un]bind functions
  2013-09-19 14:41                                         ` [PATCH] [v6] " Ben Widawsky
@ 2013-09-19 14:45                                           ` Ben Widawsky
  2013-09-20  4:06                                             ` [PATCH] " Ben Widawsky
  2013-09-19 14:47                                           ` [PATCH] [v6] " Chris Wilson
  1 sibling, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-19 14:45 UTC (permalink / raw)
  To: Intel GFX, Chris Wilson; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt, which can do aliasing as needed.
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

v4: Make the code support the secure dispatch flag, which requires
special handling during execbuf. This was fixed (incorrectly) later in
the series, but having it here earlier should be perfectly
acceptable. (Chris)
Move do_switch over to the new, special ggtt_vma interface.

v5: Don't use a local variable (or assertion) when setting the batch
object to the global GTT during secure dispatch (Chris)

v6: Calculate the exec offset for the secure case (bug fix missed in
v4). (Chris)
Remove redundant check for has_global_gtt_mapping, since it is done in
bind_vma.

v7: Remove the now-unused dev_priv in do_switch.
Don't pass the vm to ggtt_offset (error from v6 which I should have
caught before sending).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 15 ++++------
 drivers/gpu/drm/i915/i915_gem.c            | 39 +++++++++---------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  6 ++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 47 ++++++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------------
 5 files changed, 54 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 686a66c..8172eb1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1862,6 +1862,12 @@ int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
+int __must_check
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 int __must_check i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj);
@@ -2085,17 +2091,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 651b91c..4f8dc67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -43,12 +43,6 @@ static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *o
 static __must_check int
 i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 			       bool readonly);
-static __must_check int
-i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
-			   struct i915_address_space *vm,
-			   unsigned alignment,
-			   bool map_and_fenceable,
-			   bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -2693,12 +2687,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3155,7 +3145,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 /**
  * Finds free space in the GTT aperture and binds the object there.
  */
-static int
+int
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
@@ -3424,7 +3414,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3463,11 +3452,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3781,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3823,20 +3810,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bind an object
+	 * the first time, and we had aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind.*/
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cb3b7e8..2e7416f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -415,8 +415,10 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b26d979..138abc1 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+				 GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +465,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -497,14 +499,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -515,9 +509,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -936,7 +928,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct intel_ring_buffer *ring;
 	struct i915_ctx_hang_stats *hs;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
-	u32 exec_start, exec_len;
+	u32 exec_len, exec_start = args->batch_start_offset;
 	u32 mask, flags;
 	int ret, mode, i;
 	bool need_relocs;
@@ -1117,8 +1109,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE) {
+		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
+		/* Assuming all privileged batches are in the global GTT means
+		 * we need to make sure we have a global gtt offset, as well as
+		 * the PTEs mapped. As mentioned above, we can forego this on
+		 * HSW, but don't.
+		 */
+		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
+						 false);
+		if (ret)
+			goto err;
+
+		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
+			       batch_obj->cache_level,
+			       GLOBAL_BIND);
+
+		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
+	} else
+		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
@@ -1160,8 +1169,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_offset(batch_obj, vm) +
-		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ea40b3..7e4a308 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH] [v6] drm/i915: Use the new vm [un]bind functions
  2013-09-19 14:41                                         ` [PATCH] [v6] " Ben Widawsky
  2013-09-19 14:45                                           ` [PATCH] [v7] " Ben Widawsky
@ 2013-09-19 14:47                                           ` Chris Wilson
  2013-09-19 17:41                                             ` Ben Widawsky
  1 sibling, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-19 14:47 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Thu, Sep 19, 2013 at 07:41:23AM -0700, Ben Widawsky wrote:
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index b26d979..e57837c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  	if (unlikely(IS_GEN6(dev) &&
>  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
>  	    !target_i915_obj->has_global_gtt_mapping)) {
> -		i915_gem_gtt_bind_object(target_i915_obj,
> -					 target_i915_obj->cache_level);
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
> +				 GLOBAL_BIND);

Danger, danger. What address are we binding the vma to here, given
that vm != ggtt? The wa requires that the obj is mapped into the same
location in the ggtt as in the vm. That requires pinning during reserve.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] [v6] drm/i915: Use the new vm [un]bind functions
  2013-09-19 14:47                                           ` [PATCH] [v6] " Chris Wilson
@ 2013-09-19 17:41                                             ` Ben Widawsky
  2013-09-19 18:36                                               ` Daniel Vetter
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-19 17:41 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Thu, Sep 19, 2013 at 03:47:50PM +0100, Chris Wilson wrote:
> On Thu, Sep 19, 2013 at 07:41:23AM -0700, Ben Widawsky wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index b26d979..e57837c 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  	if (unlikely(IS_GEN6(dev) &&
> >  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
> >  	    !target_i915_obj->has_global_gtt_mapping)) {
> > -		i915_gem_gtt_bind_object(target_i915_obj,
> > -					 target_i915_obj->cache_level);
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
> > +				 GLOBAL_BIND);
> 
> Danger, danger. What address are we binding the vma to here, given
> that vm != ggtt? The wa requires that the obj is mapped into the same
> location in the ggtt as in the vm. That requires pinning during reserve.
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

I thought we'd agreed not to support full PPGTT on SNB? If you want an
assertion that vm == i915_ggtt, I can do that.
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] [v6] drm/i915: Use the new vm [un]bind functions
  2013-09-19 17:41                                             ` Ben Widawsky
@ 2013-09-19 18:36                                               ` Daniel Vetter
  0 siblings, 0 replies; 55+ messages in thread
From: Daniel Vetter @ 2013-09-19 18:36 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Thu, Sep 19, 2013 at 10:41:19AM -0700, Ben Widawsky wrote:
> On Thu, Sep 19, 2013 at 03:47:50PM +0100, Chris Wilson wrote:
> > On Thu, Sep 19, 2013 at 07:41:23AM -0700, Ben Widawsky wrote:
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > index b26d979..e57837c 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > @@ -286,8 +286,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > >  	if (unlikely(IS_GEN6(dev) &&
> > >  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
> > >  	    !target_i915_obj->has_global_gtt_mapping)) {
> > > -		i915_gem_gtt_bind_object(target_i915_obj,
> > > -					 target_i915_obj->cache_level);
> > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > +		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
> > > +				 GLOBAL_BIND);
> > 
> > Danger, danger. What address are we binding the vma to here, given
> > that vm != ggtt? The wa requires that the obj is mapped into the same
> > location in the ggtt as in the vm. That requires pinning during reserve.
> > -Chris
> > 
> > -- 
> > Chris Wilson, Intel Open Source Technology Centre
> 
> I thought we'd agreed not to support full PPGTT on SNB? If you want an
> assertion that vm == i915_ggtt, I can do that.

Yeah, I guess that might clarify things a bit here for this wa.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-19 14:45                                           ` [PATCH] [v7] " Ben Widawsky
@ 2013-09-20  4:06                                             ` Ben Widawsky
  2013-09-20 10:43                                               ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-20  4:06 UTC (permalink / raw)
  To: Chris Wilson, Intel GFX; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt, which can do aliasing as needed.
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

v4: Make the code support the secure dispatch flag, which requires
special handling during execbuf. This was fixed (incorrectly) later in
the series, but having it here earlier should be perfectly
acceptable. (Chris)
Move do_switch over to the new, special ggtt_vma interface.

v5: Don't use a local variable (or assertion) when setting the batch
object to the global GTT during secure dispatch (Chris)

v6: Calculate the exec offset for the secure case (bug fix missed in
v4). (Chris)
Remove redundant check for has_global_gtt_mapping, since it is done in
bind_vma.

v7: Remove the now-unused dev_priv in do_switch.
Don't pass the vm to ggtt_offset (error from v6 which I should have
caught before sending).

v8: Add an assertion, and rework the SNB workaround code (to make it a
bit clearer) so that the VM can't be anything but the GGTT.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 15 ++++-----
 drivers/gpu/drm/i915/i915_gem.c            | 39 ++++++++--------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  6 ++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 52 ++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++-------------------------
 5 files changed, 59 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 686a66c..8172eb1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1862,6 +1862,12 @@ int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
+int __must_check
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 int __must_check i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj);
@@ -2085,17 +2091,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 651b91c..4f8dc67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -43,12 +43,6 @@ static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *o
 static __must_check int
 i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 			       bool readonly);
-static __must_check int
-i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
-			   struct i915_address_space *vm,
-			   unsigned alignment,
-			   bool map_and_fenceable,
-			   bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -2693,12 +2687,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3155,7 +3145,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 /**
  * Finds free space in the GTT aperture and binds the object there.
  */
-static int
+int
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
@@ -3424,7 +3414,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3463,11 +3452,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3781,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3823,20 +3810,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bind an object
+	 * the first time, and we had aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind.*/
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cb3b7e8..2e7416f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -415,8 +415,10 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b26d979..5702a30 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,14 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		/* SNB shall not support full PPGTT. This path can only be taken
+		 * when the VM is the GGTT (aliasing PPGTT is not a real VM, and
+		 * therefore doesn't count).
+		 */
+		BUG_ON(vm != obj_to_ggtt(target_i915_obj));
+		vm->bind_vma(i915_gem_obj_to_ggtt(target_i915_obj),
+			     target_i915_obj->cache_level,
+			     GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +470,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -497,14 +504,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -515,9 +514,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -936,7 +933,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct intel_ring_buffer *ring;
 	struct i915_ctx_hang_stats *hs;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
-	u32 exec_start, exec_len;
+	u32 exec_len, exec_start = args->batch_start_offset;
 	u32 mask, flags;
 	int ret, mode, i;
 	bool need_relocs;
@@ -1117,8 +1114,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE) {
+		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
+		/* Assuming all privileged batches are in the global GTT means
+		 * we need to make sure we have a global gtt offset, as well as
+		 * the PTEs mapped. As mentioned above, we can forego this on
+		 * HSW, but don't.
+		 */
+		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
+						 false);
+		if (ret)
+			goto err;
+
+		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
+			       batch_obj->cache_level,
+			       GLOBAL_BIND);
+
+		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
+	} else
+		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
@@ -1160,8 +1174,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_offset(batch_obj, vm) +
-		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ea40b3..7e4a308 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-20  4:06                                             ` [PATCH] " Ben Widawsky
@ 2013-09-20 10:43                                               ` Chris Wilson
  2013-09-20 13:24                                                 ` Daniel Vetter
                                                                   ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Chris Wilson @ 2013-09-20 10:43 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Thu, Sep 19, 2013 at 09:06:39PM -0700, Ben Widawsky wrote:
> @@ -1117,8 +1114,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
>  	 * hsw should have this fixed, but let's be paranoid and do it
>  	 * unconditionally for now. */
> -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +	if (flags & I915_DISPATCH_SECURE) {
> +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> +		/* Assuming all privileged batches are in the global GTT means
> +		 * we need to make sure we have a global gtt offset, as well as
> +		 * the PTEs mapped. As mentioned above, we can forego this on
> +		 * HSW, but don't.
> +		 */
> +		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
> +						 false);
> +		if (ret)
> +			goto err;

bind_to_vm() has unwanted side-effects here - notably always allocating
a node and corrupting lists.

Just pin, ggtt->bind_vma, unpin. Hmmm, except that we also need a
move_to_active (as we are not presuming vm == ggtt).

pin, ggtt->bind_vma, move_to_active(ggtt), unpin.

And then hope we have the correct flushes in place for that to be
retired if nothing else is going on with that ggtt.
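
Spelled out, that sequence would look roughly like the sketch below.
The pin/unpin prototypes follow the ones quoted earlier in the series;
the name of the move-to-active helper is an assumption.

	/* Sketch: reserve and map the secure batch in the GGTT without
	 * the side effects of bind_to_vm(); error handling elided.
	 */
	ret = i915_gem_object_pin(batch_obj, ggtt, 0, false, false);
	if (ret)
		goto err;

	ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
		       batch_obj->cache_level, GLOBAL_BIND);

	/* keep the ggtt vma alive until the request retires */
	i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring);

	i915_gem_object_unpin(batch_obj);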

> +
> +		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> +			       batch_obj->cache_level,
> +			       GLOBAL_BIND);
> +
> +		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
> +	} else
> +		exec_start += i915_gem_obj_offset(batch_obj, vm);
>  
>  	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
>  	if (ret)

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-20 10:43                                               ` Chris Wilson
@ 2013-09-20 13:24                                                 ` Daniel Vetter
  2013-09-20 13:26                                                   ` Daniel Vetter
  2013-09-20 13:29                                                   ` Chris Wilson
  2013-09-20 20:44                                                 ` Ben Widawsky
  2013-09-22 18:46                                                 ` [PATCH] [v9] " Ben Widawsky
  2 siblings, 2 replies; 55+ messages in thread
From: Daniel Vetter @ 2013-09-20 13:24 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX, Ben Widawsky

On Fri, Sep 20, 2013 at 12:43 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Thu, Sep 19, 2013 at 09:06:39PM -0700, Ben Widawsky wrote:
>> @@ -1117,8 +1114,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>        * batch" bit. Hence we need to pin secure batches into the global gtt.
>>        * hsw should have this fixed, but let's be paranoid and do it
>>        * unconditionally for now. */
>> -     if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
>> -             i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
>> +     if (flags & I915_DISPATCH_SECURE) {
>> +             struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
>> +             /* Assuming all privileged batches are in the global GTT means
>> +              * we need to make sure we have a global gtt offset, as well as
>> +              * the PTEs mapped. As mentioned above, we can forego this on
>> +              * HSW, but don't.
>> +              */
>> +             ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
>> +                                              false);
>> +             if (ret)
>> +                     goto err;
>
> bind_to_vm() has unwanted side-effects here - notably always allocating
> a node and corrupting lists.
>
> Just pin, ggtt->bind_vma, unpin. Hmmm, except that we also need a
> move_to_active (as we are not presuming vm == ggtt).
>
> pin, ggtt->bind_vma, move_to_active(ggtt), unpin.
>
> And then hope we have the correct flushes in place for that to be
> retired if nothing else is going on with that ggtt.

New idea: Can't we make this work in an easier fashion by changing the
vma we look up for the eb lists using the right gtt appropriate for
the batch?

Then (presuming all our code is clear of unnecessary (obj, vm) -> vma
lookups) everything should Just Work, including grabing the gtt
offset. Or am I just dreaming here? Of course a BUG_ON to check that
vma->vm of the batch object points at the global gtt vm if we have a
secure dispatch bb would still be dutiful.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-20 13:24                                                 ` Daniel Vetter
@ 2013-09-20 13:26                                                   ` Daniel Vetter
  2013-09-20 13:29                                                   ` Chris Wilson
  1 sibling, 0 replies; 55+ messages in thread
From: Daniel Vetter @ 2013-09-20 13:26 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX, Ben Widawsky

On Fri, Sep 20, 2013 at 3:24 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Fri, Sep 20, 2013 at 12:43 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> On Thu, Sep 19, 2013 at 09:06:39PM -0700, Ben Widawsky wrote:
>>> @@ -1117,8 +1114,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>>        * batch" bit. Hence we need to pin secure batches into the global gtt.
>>>        * hsw should have this fixed, but let's be paranoid and do it
>>>        * unconditionally for now. */
>>> -     if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
>>> -             i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
>>> +     if (flags & I915_DISPATCH_SECURE) {
>>> +             struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
>>> +             /* Assuming all privileged batches are in the global GTT means
>>> +              * we need to make sure we have a global gtt offset, as well as
>>> +              * the PTEs mapped. As mentioned above, we can forego this on
>>> +              * HSW, but don't.
>>> +              */
>>> +             ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
>>> +                                              false);
>>> +             if (ret)
>>> +                     goto err;
>>
>> bind_to_vm() has unwanted side-effects here - notably always allocating
>> a node and corrupting lists.
>>
>> Just pin, ggtt->bind_vma, unpin. Hmmm, except that we also need a
>> move_to_active (as we are not presuming vm == ggtt).
>>
>> pin, ggtt->bind_vma, move_to_active(ggtt), unpin.
>>
>> And then hope we have the correct flushes in place for that to be
>> retired if nothing else is going on with that ggtt.
>
> New idea: Can't we make this work in an easier fashion by changing the
> vma we look up for the eb lists using the right gtt appropriate for
> the batch?
>
> Then (presuming all our code is clear of unnecessary (obj, vm) -> vma
> lookups) everything should Just Work, including grabing the gtt
> offset. Or am I just dreaming here? Of course a BUG_ON to check that
> vma->vm of the batch object points at the global gtt vm if we have a
> secure dispatch bb would still be dutiful.

Ok, I'm dreaming, or at least it's not that simple: We also need the
ppgtt binding for the non-CS access to the batch bo (like indirect
state and stuff). So could we instead just insert two vmas into the eb
lists, one for the ppgtt and one for the global gtt?

Of course that means we can only tackle this once we do have multiple vms.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-20 13:24                                                 ` Daniel Vetter
  2013-09-20 13:26                                                   ` Daniel Vetter
@ 2013-09-20 13:29                                                   ` Chris Wilson
  1 sibling, 0 replies; 55+ messages in thread
From: Chris Wilson @ 2013-09-20 13:29 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Ben Widawsky, Ben Widawsky

On Fri, Sep 20, 2013 at 03:24:43PM +0200, Daniel Vetter wrote:
> On Fri, Sep 20, 2013 at 12:43 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Thu, Sep 19, 2013 at 09:06:39PM -0700, Ben Widawsky wrote:
> >> @@ -1117,8 +1114,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >>        * batch" bit. Hence we need to pin secure batches into the global gtt.
> >>        * hsw should have this fixed, but let's be paranoid and do it
> >>        * unconditionally for now. */
> >> -     if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> >> -             i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> >> +     if (flags & I915_DISPATCH_SECURE) {
> >> +             struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> >> +             /* Assuming all privileged batches are in the global GTT means
> >> +              * we need to make sure we have a global gtt offset, as well as
> >> +              * the PTEs mapped. As mentioned above, we can forego this on
> >> +              * HSW, but don't.
> >> +              */
> >> +             ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
> >> +                                              false);
> >> +             if (ret)
> >> +                     goto err;
> >
> > bind_to_vm() has unwanted side-effects here - notably always allocating
> > a node and corrupting lists.
> >
> > Just pin, ggtt->bind_vma, unpin. Hmmm, except that we also need a
> > move_to_active (as we are not presuming vm == ggtt).
> >
> > pin, ggtt->bind_vma, move_to_active(ggtt), unpin.
> >
> > And then hope we have the correct flushes in place for that to be
> > retired if nothing else is going on with that ggtt.
> 
> New idea: Can't we make this work in an easier fashion by changing the
> vma we look up for the eb lists using the right gtt appropriate for
> the batch?

Not quite; in the eb it does need to be in the ppgtt for self- or
back-references to the batch bo. It is only the CS that needs the GGTT
entry in addition to the PPGTT entry required for everything else. And
we can quite happily handle having different offsets in each address
space.
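
Condensed, the execbuffer hunks above pick the CS start address like so
(a sketch using the patch's helpers; same object, possibly different
offsets in each address space):

	exec_start = args->batch_start_offset;
	if (flags & I915_DISPATCH_SECURE)
		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
	else
		exec_start += i915_gem_obj_offset(batch_obj, vm);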
 
> Then (presuming all our code is clear of unnecessary (obj, vm) -> vma
> lookups) everything should Just Work, including grabing the gtt
> offset. Or am I just dreaming here? Of course a BUG_ON to check that
> vma->vm of the batch object points at the global gtt vm if we have a
> secure dispatch bb would still be dutiful.

Dream on. :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-20 10:43                                               ` Chris Wilson
  2013-09-20 13:24                                                 ` Daniel Vetter
@ 2013-09-20 20:44                                                 ` Ben Widawsky
  2013-09-20 20:55                                                   ` Chris Wilson
  2013-09-22 18:46                                                 ` [PATCH] [v9] " Ben Widawsky
  2 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-20 20:44 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Fri, Sep 20, 2013 at 11:43:48AM +0100, Chris Wilson wrote:
> On Thu, Sep 19, 2013 at 09:06:39PM -0700, Ben Widawsky wrote:
> > @@ -1117,8 +1114,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> >  	 * hsw should have this fixed, but let's be paranoid and do it
> >  	 * unconditionally for now. */
> > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > +	if (flags & I915_DISPATCH_SECURE) {
> > +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > +		/* Assuming all privileged batches are in the global GTT means
> > +		 * we need to make sure we have a global gtt offset, as well as
> > +		 * the PTEs mapped. As mentioned above, we can forego this on
> > +		 * HSW, but don't.
> > +		 */
> > +		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
> > +						 false);
> > +		if (ret)
> > +			goto err;
> 
> bind_to_vm() has unwanted side-effects here - notably always allocating
> a node and corrupting lists.
> 
> Just pin, ggtt->bind_vma, unpin. Hmmm, except that we also need a
> move_to_active (as we are not presuming vm == ggtt).
> 
> pin, ggtt->bind_vma, move_to_active(ggtt), unpin.
> 
> And then hope we have the correct flushes in place for that to be
> retired if nothing else is going on with that ggtt.

Yes, you're right, and a particularly nice catch on the move to active; I
completely forgot. I think ggtt->bind_vma is redundant though. Shouldn't
it just be:
pin, move_to_active, unpin?

Furthermore, the actual pinning (pin count increment) should be
unnecessary, but I assume you were just trying to save me some typing.

> 
> > +
> > +		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> > +			       batch_obj->cache_level,
> > +			       GLOBAL_BIND);
> > +
> > +		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
> > +	} else
> > +		exec_start += i915_gem_obj_offset(batch_obj, vm);
> >  
> >  	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
> >  	if (ret)
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-20 20:44                                                 ` Ben Widawsky
@ 2013-09-20 20:55                                                   ` Chris Wilson
  2013-09-20 21:08                                                     ` Ben Widawsky
  2013-09-20 21:22                                                     ` Daniel Vetter
  0 siblings, 2 replies; 55+ messages in thread
From: Chris Wilson @ 2013-09-20 20:55 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Fri, Sep 20, 2013 at 01:44:23PM -0700, Ben Widawsky wrote:
> On Fri, Sep 20, 2013 at 11:43:48AM +0100, Chris Wilson wrote:
> > On Thu, Sep 19, 2013 at 09:06:39PM -0700, Ben Widawsky wrote:
> > > @@ -1117,8 +1114,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> > >  	 * hsw should have this fixed, but let's be paranoid and do it
> > >  	 * unconditionally for now. */
> > > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > > +	if (flags & I915_DISPATCH_SECURE) {
> > > +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > > +		/* Assuming all privileged batches are in the global GTT means
> > > +		 * we need to make sure we have a global gtt offset, as well as
> > > +		 * the PTEs mapped. As mentioned above, we can forego this on
> > > +		 * HSW, but don't.
> > > +		 */
> > > +		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
> > > +						 false);
> > > +		if (ret)
> > > +			goto err;
> > 
> > bind_to_vm() has unwanted side-effects here - notably always allocating
> > a node and corrupting lists.
> > 
> > Just pin, ggtt->bind_vma, unpin. Hmmm, except that we also need a
> > move_to_active (as we are not presuming vm == ggtt).
> > 
> > pin, ggtt->bind_vma, move_to_active(ggtt), unpin.
> > 
> > And then hope we have the correct flushes in place for that to be
> > retired if nothing else is going on with that ggtt.
> 
> Yes, you're right, and a particularly nice catch on the move to active; I
> completely forgot. I think ggtt->bind_vma is redundant though. Shouldn't
> it just be:
> pin, move_to_active, unpin?

Since we will ask for a !map_and_fenceable pin, pin() will not
automatically bind into the global GTT, so I think we still need the
ggtt->bind_vma().
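
To be concrete, the pin path in your patch does, roughly:

  /* in i915_gem_object_pin(): */
  const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
  ...
  vma = i915_gem_obj_to_vma(obj, vm);
  vm->bind_vma(vma, obj->cache_level, flags);

so with map_and_fenceable == false the vma is bound with flags == 0,
i.e. without GLOBAL_BIND, and no global GTT PTEs are written.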

 
> Furthermore, the actual pinning (pin count increment) should be
> unnecessary, but I assume you were just trying to save me some typing.

Yes, the pin-count adjustments should be unnecessary - but not a huge
burden, and I was thinking it may help in the future as we may want to
explicitly hold the pin until move-to-active for all objects. That
future being where we strive to reduce hold times on struct_mutex.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-20 20:55                                                   ` Chris Wilson
@ 2013-09-20 21:08                                                     ` Ben Widawsky
  2013-09-20 21:22                                                     ` Daniel Vetter
  1 sibling, 0 replies; 55+ messages in thread
From: Ben Widawsky @ 2013-09-20 21:08 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Fri, Sep 20, 2013 at 09:55:51PM +0100, Chris Wilson wrote:
> On Fri, Sep 20, 2013 at 01:44:23PM -0700, Ben Widawsky wrote:
> > On Fri, Sep 20, 2013 at 11:43:48AM +0100, Chris Wilson wrote:
> > > On Thu, Sep 19, 2013 at 09:06:39PM -0700, Ben Widawsky wrote:
> > > > @@ -1117,8 +1114,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > > >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> > > >  	 * hsw should have this fixed, but let's be paranoid and do it
> > > >  	 * unconditionally for now. */
> > > > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > > > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > > > +	if (flags & I915_DISPATCH_SECURE) {
> > > > +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > > > +		/* Assuming all privileged batches are in the global GTT means
> > > > +		 * we need to make sure we have a global gtt offset, as well as
> > > > +		 * the PTEs mapped. As mentioned above, we can forego this on
> > > > +		 * HSW, but don't.
> > > > +		 */
> > > > +		ret = i915_gem_object_bind_to_vm(batch_obj, ggtt, 0, false,
> > > > +						 false);
> > > > +		if (ret)
> > > > +			goto err;
> > > 
> > > bind_to_vm() has unwanted side-effects here - notably always allocating
> > > a node and corrupting lists.
> > > 
> > > Just pin, ggtt->bind_vma, unpin. Hmmm, except that we also need a
> > > move_to_active (as we are not presuming vm == ggtt).
> > > 
> > > pin, ggtt->bind_vma, move_to_active(ggtt), unpin.
> > > 
> > > And then hope we have the correct flushes in place for that to be
> > > retired if nothing else is going on with that ggtt.
> > 
> > Yes, you're right, and a particularly nice catch on the move to active; I
> > completely forgot. I think ggtt->bind_vma is redundant though. Shouldn't
> > it just be:
> > pin, move_to_active, unpin?
> 
> Since we will ask for a !map_and_fenceable pin, pin() will not
> automatically bind into the global GTT, so I think we still need the
> ggtt->bind_vma().
> 
pin gets passed a VM, which will be the GGTT here.
>  
> > Furthermore, the actual pinning (pin count increment) should be
> > unnecessary, but I assume you were just trying to save me some typing.
> 
> Yes, the pin-count adjustments should be unnecessary - but not a huge
> burden, and I was thinking it may help in the future as we may want to
> explicitly hold the pin until move-to-active for all objects. That
> future being where we strive to reduce hold times on struct_mutex.
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/i915: Use the new vm [un]bind functions
  2013-09-20 20:55                                                   ` Chris Wilson
  2013-09-20 21:08                                                     ` Ben Widawsky
@ 2013-09-20 21:22                                                     ` Daniel Vetter
  1 sibling, 0 replies; 55+ messages in thread
From: Daniel Vetter @ 2013-09-20 21:22 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Ben Widawsky, Intel GFX

On Fri, Sep 20, 2013 at 09:55:51PM +0100, Chris Wilson wrote:
> On Fri, Sep 20, 2013 at 01:44:23PM -0700, Ben Widawsky wrote:
> > Furthermore, the actual pinning (pin count increment) should be
> > unnecessary, but I assume you were just trying to save me some typing.
> 
> Yes, the pin-count adjustments should be unnecessary - but not a huge
> burden, and I was thinking it may help in the future as we may want to
> explicitly hold the pin until move-to-active for all objects. That
> future being where we strive to reduce hold times on struct_mutex.

My grand plan is that pinning-to-mark-an-object-reserved-for-execbuf will
be replaced by per-object-lock-acquired. By using the owner-tracking of ww
mutexes we'll even get a "you have this already acquired" notice for free.
And then we obviously need to hold the ww mutex lock until we're done
updating the state, so past the move-to-active.

But I haven't worked out a concrete plan for how to get there yet, so
dunno whether sprinkling more pinning around is a good idea or not. Just
wanted to drop my 2 uninformed cents here ;-)
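
For flavour, the per-object ww mutex dance would look something like
this - a very rough sketch, and obj->resv_lock plus the ww class name
are entirely made up:

  static DEFINE_WW_CLASS(i915_resv_class);

  struct ww_acquire_ctx ticket;

  ww_acquire_init(&ticket, &i915_resv_class);

  ret = ww_mutex_lock(&obj->resv_lock, &ticket);
  if (ret == -EALREADY) {
    /* this context already holds the object's lock - the "you
     * have this already acquired" notice for free */
  } else if (ret == -EDEADLK) {
    /* deadlock: drop all locks held in this context, then block
     * on the contended lock and restart the acquire pass */
    ww_mutex_lock_slow(&obj->resv_lock, &ticket);
  }
  ww_acquire_done(&ticket);

  /* ... reserve, relocate, move-to-active ... */

  ww_mutex_unlock(&obj->resv_lock);
  ww_acquire_fini(&ticket);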
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH] [v9] drm/i915: Use the new vm [un]bind functions
  2013-09-20 10:43                                               ` Chris Wilson
  2013-09-20 13:24                                                 ` Daniel Vetter
  2013-09-20 20:44                                                 ` Ben Widawsky
@ 2013-09-22 18:46                                                 ` Ben Widawsky
  2013-09-23  8:39                                                   ` Chris Wilson
  2 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-22 18:46 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

From: Ben Widawsky <ben@bwidawsk.net>

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

v4: Make the code support the secure dispatch flag, which requires
special handling during execbuf. This was fixed (incorrectly) later in
the series, but having it here earlier in the series should be perfectly
acceptable. (Chris)
Move do_switch over to the new, special ggtt_vma interface.

v5: Don't use a local variable (or assertion) when setting the batch
object to the global GTT during secure dispatch (Chris)

v6: Calculate the exec offset for the secure case (Bug fix missed on
v4). (Chris)
Remove redundant check for has_global_gtt_mapping, since it is done in
bind_vma.

v7: Remove now unused dev_priv in do_switch
Don't pass the vm to ggtt_offset (error from v6 which I should have
caught before sending).

v8: Assert, and rework the SNB workaround (to make it a bit clearer)
code to make sure the VM can't be anything but the GGTT.

v9: Fixing more bugs which can't exist yet at the behest of Chris. Make
sure that the batch object is properly bound, and added to the global
VM's active list - for when we use non-global VMs. (Chris)

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  9 -----
 drivers/gpu/drm/i915/i915_gem.c            | 33 +++++++----------
 drivers/gpu/drm/i915/i915_gem_context.c    |  6 ++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 59 ++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------
 5 files changed, 60 insertions(+), 95 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9995cdb..e8ae8fd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2102,17 +2102,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f6c8b0e..378d4ef 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2693,12 +2693,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3155,7 +3151,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 /**
  * Finds free space in the GTT aperture and binds the object there.
  */
-static int
+int
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
@@ -3424,7 +3420,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3463,11 +3458,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3795,6 +3787,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3823,20 +3816,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created mappable and fenceable. If we bind an object
+	 * the first time, and we had aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind. */
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1a877a5..d4eb88a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -422,8 +422,10 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0ce0d47..51dd656 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -286,8 +286,14 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		/* SNB shall not support full PPGTT. This path can only be taken
+		 * when the VM is the GGTT (aliasing PPGTT is not a real VM, and
+		 * therefore doesn't count).
+		 */
+		BUG_ON(vm != obj_to_ggtt(target_i915_obj));
+		vm->bind_vma(i915_gem_obj_to_ggtt(target_i915_obj),
+			     target_i915_obj->cache_level,
+			     GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -464,11 +470,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -497,14 +504,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -515,9 +514,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -936,7 +933,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct intel_ring_buffer *ring;
 	struct i915_ctx_hang_stats *hs;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
-	u32 exec_start, exec_len;
+	u32 exec_len, exec_start = args->batch_start_offset;
 	u32 mask, flags;
 	int ret, mode, i;
 	bool need_relocs;
@@ -1118,8 +1115,32 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE) {
+		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
+		/* Assuming all privileged batches are in the global GTT means
+		 * we need to make sure we have a global gtt offset, as well as
+		 * the PTEs mapped. As mentioned above, we can forego this on
+		 * HSW, but don't.
+		 */
+		ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false);
+		if (ret)
+			goto err;
+
+		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
+			       batch_obj->cache_level,
+			       GLOBAL_BIND);
+
+		/* XXX: Since the active list is per VM, we need to make sure
+		 * this VMA ends up on the GGTT's active list to avoid premature
+		 * eviction.
+		 */
+		i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring);
+
+		i915_gem_object_unpin(batch_obj);
+
+		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
+	} else
+		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
@@ -1161,8 +1182,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_offset(batch_obj, vm) +
-		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 65b61d4..e053f14 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH] [v9] drm/i915: Use the new vm [un]bind functions
  2013-09-22 18:46                                                 ` [PATCH] [v9] " Ben Widawsky
@ 2013-09-23  8:39                                                   ` Chris Wilson
  2013-09-23 22:00                                                     ` Ben Widawsky
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2013-09-23  8:39 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Sun, Sep 22, 2013 at 11:46:00AM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
> 
> Building on the last patch which created the new function pointers in
> the VM for bind/unbind, here we actually put those new function pointers
> to use.
> 
> Split out as a separate patch to aid in review. I'm fine with squashing
> into the previous patch if people request it.
> 
> v2: Updated to address the smart ggtt which can do aliasing as needed
> Make sure we bind to global gtt when mappable and fenceable. I thought
> we could get away without this initially, but we cannot.
> 
> v3: Make the global GTT binding explicitly use the ggtt VM for
> bind_vma(). While at it, use the new ggtt_vma helper (Chris)
> 
> v4: Make the code support the secure dispatch flag, which requires
> special handling during execbuf. This was fixed (incorrectly) later in
> the series, but having it here earlier in the series should be perfectly
> acceptable. (Chris)
> Move do_switch over to the new, special ggtt_vma interface.
> 
> v5: Don't use a local variable (or assertion) when setting the batch
> object to the global GTT during secure dispatch (Chris)
> 
> v6: Calculate the exec offset for the secure case (Bug fix missed on
> v4). (Chris)
> Remove redundant check for has_global_gtt_mapping, since it is done in
> bind_vma.
> 
> v7: Remove now unused dev_priv in do_switch
> Don't pass the vm to ggtt_offset (error from v6 which I should have
> caught before sending).
> 
> v8: Assert, and rework the SNB workaround (to make it a bit clearer)
> code to make sure the VM can't be anything but the GGTT.
> 
> v9: Fixing more bugs which can't exist yet at the behest of Chris. Make
> sure that the batch object is properly bound, and added to the global
> VM's active list - for when we use non-global VMs. (Chris)

Not quite, the patch introduced an outright bug in addition to the
potential issue of vm != ggtt.

> CC: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Minor comments inline,
(for the series) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

> @@ -1118,8 +1115,32 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
>  	 * hsw should have this fixed, but let's be paranoid and do it
>  	 * unconditionally for now. */
> -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +	if (flags & I915_DISPATCH_SECURE) {
> +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);

Please leave whitespace after variable declarations.

> +		/* Assuming all privileged batches are in the global GTT means
> +		 * we need to make sure we have a global gtt offset, as well as
> +		 * the PTEs mapped. As mentioned above, we can forego this on
> +		 * HSW, but don't.
> +		 */
And a line of whitespace here since this is a block comment and not
closely coupled to the next line of code.

> +		ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false);
> +		if (ret)
> +			goto err;
> +
> +		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> +			       batch_obj->cache_level,
> +			       GLOBAL_BIND);
> +
> +		/* XXX: Since the active list is per VM, we need to make sure
> +		 * this VMA ends up on the GGTT's active list to avoid premature
> +		 * eviction.
> +		 */

No XXX required, unless you have a magical plan; the reasoning is sound.

> +		i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring);
> +
> +		i915_gem_object_unpin(batch_obj);

I think this interface violates Rusty's rules (API should be easy to
use but hard to misuse).

  vma = i915_gem_object_pin(batch_obj, ggtt, 0, false, false);
  if (IS_ERR(vma)) {
    ret = PTR_ERR(vma);
    goto err;
  }

  ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND); // this would be a flag to a future pin()
  i915_vma_move_to_active(vma, ring);

  exec_start += vma->node.start;
  i915_gem_object_unpin(batch_obj, vma);

What I am stressing here is that the vma->node is only valid whilst the
object is pinned, and that access should be through the vma rather than
the object. However, that interface is a little too idealistic.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] [v9] drm/i915: Use the new vm [un]bind functions
  2013-09-23  8:39                                                   ` Chris Wilson
@ 2013-09-23 22:00                                                     ` Ben Widawsky
  2013-09-23 22:10                                                       ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Ben Widawsky @ 2013-09-23 22:00 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Mon, Sep 23, 2013 at 09:39:31AM +0100, Chris Wilson wrote:
> On Sun, Sep 22, 2013 at 11:46:00AM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben@bwidawsk.net>
> > 
> > Building on the last patch which created the new function pointers in
> > the VM for bind/unbind, here we actually put those new function pointers
> > to use.
> > 
> > Split out as a separate patch to aid in review. I'm fine with squashing
> > into the previous patch if people request it.
> > 
> > v2: Updated to address the smart ggtt which can do aliasing as needed
> > Make sure we bind to global gtt when mappable and fenceable. I thought
> > we could get away without this initially, but we cannot.
> > 
> > v3: Make the global GTT binding explicitly use the ggtt VM for
> > bind_vma(). While at it, use the new ggtt_vma helper (Chris)
> > 
> > v4: Make the code support the secure dispatch flag, which requires
> > special handling during execbuf. This was fixed (incorrectly) later in
> > the series, but having it here earlier in the series should be perfectly
> > acceptable. (Chris)
> > Move do_switch over to the new, special ggtt_vma interface.
> > 
> > v5: Don't use a local variable (or assertion) when setting the batch
> > object to the global GTT during secure dispatch (Chris)
> > 
> > v6: Calculate the exec offset for the secure case (Bug fix missed on
> > v4). (Chris)
> > Remove redundant check for has_global_gtt_mapping, since it is done in
> > bind_vma.
> > 
> > v7: Remove now unused dev_priv in do_switch
> > Don't pass the vm to ggtt_offset (error from v6 which I should have
> > caught before sending).
> > 
> > v8: Assert, and rework the SNB workaround (to make it a bit clearer)
> > code to make sure the VM can't be anything but the GGTT.
> > 
> > v9: Fixing more bugs which can't exist yet at the behest of Chris. Make
> > sure that the batch object is properly bound, and added to the global
> > VM's active list - for when we use non-global VMs. (Chris)
> 
> Not quite, the patch introduced an outright bug in addition to the
> potential issue of vm != ggtt.

Which bug are we talking about again? I'll update the commit message,
but I never saw a bug other than vm != ggtt (and ones I introduced while
trying to make you happy since v3).

> 
> > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Minor comments inline,
> (for the series) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> > @@ -1118,8 +1115,32 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> >  	 * hsw should have this fixed, but let's be paranoid and do it
> >  	 * unconditionally for now. */
> > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > +	if (flags & I915_DISPATCH_SECURE) {
> > +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> 
> Please leave whitespace after variable declarations.
> 
> > +		/* Assuming all privileged batches are in the global GTT means
> > +		 * we need to make sure we have a global gtt offset, as well as
> > +		 * the PTEs mapped. As mentioned above, we can forego this on
> > +		 * HSW, but don't.
> > +		 */
> And a line of whitespace here since this is a block comment and not
> closely coupled to the next line of code.
> 
> > +		ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false);
> > +		if (ret)
> > +			goto err;
> > +
> > +		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> > +			       batch_obj->cache_level,
> > +			       GLOBAL_BIND);
> > +
> > +		/* XXX: Since the active list is per VM, we need to make sure
> > +		 * this VMA ends up on the GGTT's active list to avoid premature
> > +		 * eviction.
> > +		 */
> 
> No XXX required, unless you have a magical plan; the reasoning is sound.

I used the XXX because it is a bit tricky to understand for the casual
observer. Maybe "NB" was more appropriate. Anyway, removed.

> 
> > +		i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring);
> > +
> > +		i915_gem_object_unpin(batch_obj);
> 
> I think this interface violates Rusty's rules (API should be easy to
> use but hard to misuse).
> 
>   vma = i915_gem_object_pin(batch_obj, ggtt, 0, false, false);
>   if (IS_ERR(vma)) {
>     ret = PTR_ERR(vma);
>     goto err;
>   }
> 

You're missing a step here, I assume you mean:
i915_gem_obj_ggtt_pin(...)
vma = i915_gem_obj_to_ggtt(...)
if (IS_ERR)...
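
Spelled out against the current interface, that would be (rough
sketch):

  ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false);
  if (ret)
    goto err;

  vma = i915_gem_obj_to_ggtt(batch_obj);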

Or had you something else in mind?

>   ggtt->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND); // this would be a flag to a future pin()
>   i915_vma_move_to_active(vma, ring);
> 
>   exec_start += vma->node.start;
>   i915_gem_object_unpin(batch_obj, vma);
> 
> What I am stressing here is that the vma->node is only valid whilst the
> object is pinned, and that access should be through the vma rather than
> the object. However, that interface is a little too idealistic.
> -Chris
> 

I am fine with this change - though I don't personally find it any
clearer at stressing your point. Earlier requests were heavily in favor
of using "ggtt" where it couldn't be a generic VM. I therefore think the
excessive use of i915_gem_obj_to_ggtt makes the code look less generic,
which it is. I.e. I don't see this as abusing the interface.

I got all the other nitpicks. Once you ack the "missing step" I shall
resend.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] [v9] drm/i915: Use the new vm [un]bind functions
  2013-09-23 22:00                                                     ` Ben Widawsky
@ 2013-09-23 22:10                                                       ` Chris Wilson
  0 siblings, 0 replies; 55+ messages in thread
From: Chris Wilson @ 2013-09-23 22:10 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Mon, Sep 23, 2013 at 03:00:06PM -0700, Ben Widawsky wrote:
> > I think this interface violates Rusty's rules (API should be easy to
> > use but hard to misuse).
> > 
> >   vma = i915_gem_object_pin(batch_obj, ggtt, 0, false, false);
> >   if (IS_ERR(vm)) {
> >     ret = PTR_ERR(vm);
> >     goto err;
> >   }
> > 
> 
> You're missing a step here, I assume you mean:
> i915_gem_obj_ggtt_pin(...)
> vma = i915_gem_obj_to_ggtt(...)
> if (IS_ERR)...
> 
> Or had you something else in mind?

I was thinking of making the pin return the vma instead. Then you know
that the vma is valid until it is unpinned. I think it helps here, but to
cater for all use cases we would need something analogous to get_pages,
pin_pages and unpin_pages. (get_vma, pin_vma, unpin_vma).
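
i.e. something along the lines of (signatures only, nothing final):

  struct i915_vma *
  i915_gem_object_pin(struct drm_i915_gem_object *obj,
                      struct i915_address_space *vm,
                      u32 alignment, unsigned flags);

  /* and, mirroring get_pages/pin_pages/unpin_pages: */
  struct i915_vma *i915_gem_obj_get_vma(struct drm_i915_gem_object *obj,
                                        struct i915_address_space *vm);
  int i915_gem_obj_pin_vma(struct i915_vma *vma);
  void i915_gem_obj_unpin_vma(struct i915_vma *vma);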
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2013-09-23 22:10 UTC | newest]

Thread overview: 55+ messages
2013-09-14 22:03 [PATCH 1/6] drm/i915: trace vm eviction instead of everything Ben Widawsky
2013-09-14 22:03 ` [PATCH 2/6] drm/i915: Provide a cheap ggtt vma lookup Ben Widawsky
2013-09-14 22:03 ` [PATCH 3/6] drm/i915: Convert active API to VMA Ben Widawsky
2013-09-14 22:03 ` [PATCH 4/6] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
2013-09-16  9:25   ` Chris Wilson
2013-09-16 18:23     ` Ben Widawsky
2013-09-16 22:05       ` Daniel Vetter
2013-09-16 22:18         ` Chris Wilson
2013-09-16 22:20           ` Daniel Vetter
2013-09-16 22:13       ` Chris Wilson
2013-09-17  5:44         ` Ben Widawsky
2013-09-17  7:49           ` Chris Wilson
2013-09-17 17:00             ` [PATCH 4/6] [v5] " Ben Widawsky
2013-09-14 22:03 ` [PATCH 5/6] drm/i915: Use the new vm [un]bind functions Ben Widawsky
2013-09-16  7:37   ` Chris Wilson
2013-09-16 18:31     ` Ben Widawsky
2013-09-17 17:01     ` [PATCH 5/6] [v3] " Ben Widawsky
2013-09-17 20:55       ` Chris Wilson
2013-09-17 23:14         ` Ben Widawsky
2013-09-17 23:33           ` Chris Wilson
2013-09-17 23:48             ` Ben Widawsky
2013-09-17 23:57               ` Chris Wilson
2013-09-18  0:02                 ` Ben Widawsky
2013-09-18  8:30                   ` Chris Wilson
2013-09-18 14:47                     ` Ben Widawsky
2013-09-18 14:53                       ` Chris Wilson
2013-09-18 15:48                         ` Ben Widawsky
2013-09-18 15:59                           ` Chris Wilson
2013-09-18 16:11                             ` Ben Widawsky
2013-09-18 16:15                               ` Chris Wilson
2013-09-18 16:20                                 ` Daniel Vetter
2013-09-18 16:37                                   ` Ben Widawsky
2013-09-19  0:12                                 ` [PATCH] [v4] " Ben Widawsky
2013-09-19  9:13                                   ` Chris Wilson
2013-09-19 14:15                                     ` [PATCH] [v5] " Ben Widawsky
2013-09-19 14:26                                       ` Chris Wilson
2013-09-19 14:41                                         ` [PATCH] [v6] " Ben Widawsky
2013-09-19 14:45                                           ` [PATCH] [v7] " Ben Widawsky
2013-09-20  4:06                                             ` [PATCH] " Ben Widawsky
2013-09-20 10:43                                               ` Chris Wilson
2013-09-20 13:24                                                 ` Daniel Vetter
2013-09-20 13:26                                                   ` Daniel Vetter
2013-09-20 13:29                                                   ` Chris Wilson
2013-09-20 20:44                                                 ` Ben Widawsky
2013-09-20 20:55                                                   ` Chris Wilson
2013-09-20 21:08                                                     ` Ben Widawsky
2013-09-20 21:22                                                     ` Daniel Vetter
2013-09-22 18:46                                                 ` [PATCH] [v9] " Ben Widawsky
2013-09-23  8:39                                                   ` Chris Wilson
2013-09-23 22:00                                                     ` Ben Widawsky
2013-09-23 22:10                                                       ` Chris Wilson
2013-09-19 14:47                                           ` [PATCH] [v6] " Chris Wilson
2013-09-19 17:41                                             ` Ben Widawsky
2013-09-19 18:36                                               ` Daniel Vetter
2013-09-14 22:03 ` [PATCH 6/6] drm/i915: eliminate vm->insert_entries() Ben Widawsky
