From: Ben Widawsky <benjamin.widawsky@intel.com>
To: Intel GFX <intel-gfx@lists.freedesktop.org>
Cc: Ben Widawsky <ben@bwidawsk.net>
Subject: [PATCH 11/48] drm/i915: Create bind/unbind abstraction for VMAs
Date: Fri,  6 Dec 2013 14:10:56 -0800
Message-ID: <1386367941-7131-11-git-send-email-benjamin.widawsky@intel.com>
In-Reply-To: <1386367941-7131-1-git-send-email-benjamin.widawsky@intel.com>

From: Ben Widawsky <ben@bwidawsk.net>

To sum up what goes on here: we abstract the VMA binding, similarly to
the previous object binding. This helps distinguish legacy binding from
modern binding. To keep the code churn as minimal as possible, I am
leaving insert_entries() in place; it serves, essentially, as the
per-platform PTE writer. bind_vma and insert_entries share a lot of
similarities, and I did have designs to combine the two, but as
mentioned already, that would be too much churn in an already massive
patchset.
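
Concretely, call sites that used to pick a binding helper per mapping
type now make one indirect call. A condensed before/after, taken from
the set_cache_level hunk in the diff below:

	/* before: branch on the mapping type at every call site */
	if (obj->has_global_gtt_mapping)
		i915_gem_gtt_bind_object(obj, cache_level);
	if (obj->has_aliasing_ppgtt_mapping)
		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
				       obj, cache_level);

	/* after: one indirect call per VMA; the pointer installed at
	 * VMA creation time selects the correct PTE writing path */
	list_for_each_entry(vma, &obj->vma_list, vma_link)
		vma->bind_vma(vma, cache_level, 0);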

What follows are the 3 commits which existed as discrete patches in the
original submissions. Upon rebasing onto Broadwell support, it became
clear that the separation was not good, and only made for more
error-prone code. Below are the 3 commit messages with all their
history.

drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage

drm/i915: Add bind/unbind object functions to VMA

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the VM and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply call vm->bind, and not have to worry about
distinguishing PPGTT vs. GGTT.
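
The unbind side collapses the same way; condensed from the
i915_vma_unbind hunk in the diff (shown in its final, v6 form, where
the pointers live on the VMA):

	/* before: two mapping-specific teardown paths */
	if (obj->has_global_gtt_mapping)
		i915_gem_gtt_unbind_object(obj);
	if (obj->has_aliasing_ppgtt_mapping) {
		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
		obj->has_aliasing_ppgtt_mapping = 0;
	}

	/* after: the VMA knows how to unmap itself */
	vma->unbind_vma(vma);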

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support the above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding; this
happens when setting cache levels.
Use VMA for bind/unbind (Daniel, Ben)

v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change the logic in unbind to only unbind the ggtt when there
is a global mapping, and remove a redundant check for the existence of
the aliasing ppgtt.

v4: Make the bind function a bit smarter about cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes and put version info here (Daniel).

v5: Update the comment to not suck (Chris)

v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). This change allows us to keep
a distinct insert_entries.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use the new vm [un]bind functions

Building on the last patch, which created the new bind/unbind function
pointers in the VM, here we actually put those new function pointers to
use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt, which can do aliasing as needed.
Make sure we bind to the global gtt when mappable and fenceable. I
thought we could get away without this initially, but we cannot.
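
In the final form this reduces to a flag computed once at the top of
i915_gem_object_pin; a condensed sketch of the i915_gem.c hunk below:

	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);

	/* per v4^ below, bind_vma() can be called safely during pin,
	 * replacing the old conditionals */
	vma->bind_vma(vma, obj->cache_level, flags);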

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

At this point the original mailing list thread diverges; i.e.:

v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin, so simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in the resume path for globally bound BOs
Properly handle secure dispatch (see the condensed hunk below)
Rebased on vma bind/unbind conversion
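
The secure dispatch handling, condensed from the execbuffer hunk in the
diff: snb/ivb/vlv conflate the "batch in ppgtt" bit with the
"non-secure batch" bit, so secure batches must be bound into the global
GTT.

	if (flags & I915_DISPATCH_SECURE &&
	    !batch_obj->has_global_gtt_mapping) {
		/* secure batches must live in the global GTT */
		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
		BUG_ON(!vma);
		vma->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
	}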

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: reduce vm->insert_entries() usage

FKA: drm/i915: eliminate vm->insert_entries()

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range as well;
however, that is not totally easy at this point, since it is still used
in a couple of places that don't deal only in objects: setup, ppgtt
init, and restoring gtt mappings.

v2: Don't actually remove insert_entries; just limit its usage. It will
be useful when we introduce gen8, and it will always be called from the
vma bind/unbind paths.
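
With that limit in place, insert_entries is reached only through the
VMA bind paths; e.g. the pure PPGTT case from the i915_gem_gtt.c hunk
(still marked __always_unused in the diff, since nothing calls it
until real, full PPGTT lands):

	static void ppgtt_bind_vma(struct i915_vma *vma,
				   enum i915_cache_level cache_level,
				   u32 flags)
	{
		const unsigned long entry = vma->node.start >> PAGE_SHIFT;

		/* no binding flags apply to a pure PPGTT mapping */
		WARN_ON(flags);

		vma->vm->insert_entries(vma->vm, vma->obj->pages, entry,
					cache_level);
	}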

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 100 +++++++--------
 drivers/gpu/drm/i915/i915_gem.c            |  61 ++-------
 drivers/gpu/drm/i915/i915_gem_context.c    |   8 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  43 ++++---
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 192 +++++++++++++++++++++++------
 5 files changed, 244 insertions(+), 160 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ce80a72..cc49e83 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -523,6 +523,56 @@ enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an objects lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct drm_mm_node node;
+	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+
+	/** This vma's place in the batchbuffer or on the eviction list */
+	struct list_head exec_list;
+
+	/**
+	 * Used for performing relocations during execbuffer insertion.
+	 */
+	struct hlist_node exec_node;
+	unsigned long exec_handle;
+	struct drm_i915_gem_exec_object2 *exec_entry;
+
+	/** How many users have pinned this object in GTT space. The following
+	 * users can each hold at most one reference: pwrite/pread, pin_ioctl
+	 * (via user_pin_count), execbuffer (objects are not allowed multiple
+	 * times for the same batchbuffer), and the framebuffer code. When
+	 * switching/pageflipping, the framebuffer code has at most two buffers
+	 * pinned per crtc.
+	 *
+	 * In the worst case this is 1 + 1 + 1 + 2*2 = 7. That would fit into 3
+	 * bits with absolutely no headroom. So use 4 bits. */
+	unsigned int pin_count:4;
+#define DRM_I915_GEM_OBJECT_MAX_PIN_COUNT 0xf
+
+	/** Unmap an object from an address space. This usually consists of
+	 * setting the valid PTE entries to a reserved scratch page. */
+	void (*unbind_vma)(struct i915_vma *vma);
+	/* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+	void (*bind_vma)(struct i915_vma *vma,
+			 enum i915_cache_level cache_level,
+			 u32 flags);
+};
+
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
@@ -623,47 +673,6 @@ struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an objects lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
-	struct drm_mm_node node;
-	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm;
-
-	/** This object's place on the active/inactive lists */
-	struct list_head mm_list;
-
-	struct list_head vma_link; /* Link in the object's VMA list */
-
-	/** This vma's place in the batchbuffer or on the eviction list */
-	struct list_head exec_list;
-
-	/**
-	 * Used for performing relocations during execbuffer insertion.
-	 */
-	struct hlist_node exec_node;
-	unsigned long exec_handle;
-	struct drm_i915_gem_exec_object2 *exec_entry;
-
-	/** How many users have pinned this object in GTT space. The following
-	 * users can each hold at most one reference: pwrite/pread, pin_ioctl
-	 * (via user_pin_count), execbuffer (objects are not allowed multiple
-	 * times for the same batchbuffer), and the framebuffer code. When
-	 * switching/pageflipping, the framebuffer code has at most two buffers
-	 * pinned per crtc.
-	 *
-	 * In the worst case this is 1 + 1 + 1 + 2*2 = 7. That would fit into 3
-	 * bits with absolutely no headroom. So use 4 bits. */
-	unsigned int pin_count:4;
-#define DRM_I915_GEM_OBJECT_MAX_PIN_COUNT 0xf
-};
-
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
@@ -2238,19 +2247,10 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_check_and_clear_faults(struct drm_device *dev);
 void i915_gem_suspend_gtt_mappings(struct drm_device *dev);
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 66b8dbc..894e2fa 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2743,12 +2743,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 
 	list_del(&vma->mm_list);
@@ -3479,7 +3475,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3518,11 +3513,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->bind_vma(vma, cache_level, 0);
 	}
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link)
@@ -3850,6 +3842,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3878,20 +3871,17 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
 	}
 
-	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma = i915_gem_obj_to_vma(obj, vm);
+
+	vma->bind_vma(vma, obj->cache_level, flags);
 
 	i915_gem_obj_to_vma(obj, vm)->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
@@ -4235,41 +4225,6 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 	return NULL;
 }
 
-static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
-					      struct i915_address_space *vm)
-{
-	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
-	if (vma == NULL)
-		return ERR_PTR(-ENOMEM);
-
-	INIT_LIST_HEAD(&vma->vma_link);
-	INIT_LIST_HEAD(&vma->mm_list);
-	INIT_LIST_HEAD(&vma->exec_list);
-	vma->vm = vm;
-	vma->obj = obj;
-
-	/* Keep GGTT vmas first to make debug easier */
-	if (i915_is_ggtt(vm))
-		list_add(&vma->vma_link, &obj->vma_list);
-	else
-		list_add_tail(&vma->vma_link, &obj->vma_list);
-
-	return vma;
-}
-
-struct i915_vma *
-i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
-				  struct i915_address_space *vm)
-{
-	struct i915_vma *vma;
-
-	vma = i915_gem_obj_to_vma(obj, vm);
-	if (!vma)
-		vma = __i915_gem_vma_create(obj, vm);
-
-	return vma;
-}
-
 void i915_gem_vma_destroy(struct i915_vma *vma)
 {
 	WARN_ON(vma->node.allocated);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9775d8b..44a84e2 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -406,6 +406,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret, i;
@@ -440,8 +441,11 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
+							   &dev_priv->gtt.base);
+		vma->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 742f920..964390c 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -91,6 +91,7 @@ eb_lookup_vmas(struct eb_vmas *eb,
 	       struct i915_address_space *vm,
 	       struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = vm->dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	struct list_head objects;
 	int i, ret = 0;
@@ -125,6 +126,15 @@ eb_lookup_vmas(struct eb_vmas *eb,
 	i = 0;
 	list_for_each_entry(obj, &objects, obj_exec_link) {
 		struct i915_vma *vma;
+		struct i915_address_space *bind_vm = vm;
+
+		/* If we have secure dispatch, or the userspace assures us that
+		 * they know what they're doing, use the GGTT VM.
+		 */
+		if (exec[i].flags & EXEC_OBJECT_NEEDS_GTT ||
+		    ((args->flags & I915_EXEC_SECURE) &&
+		    (i == (args->buffer_count - 1))))
+			bind_vm = &dev_priv->gtt.base;
 
 		/*
 		 * NOTE: We can leak any vmas created here when something fails
@@ -134,7 +144,7 @@ eb_lookup_vmas(struct eb_vmas *eb,
 		 * from the (obj, vm) we don't run the risk of creating
 		 * duplicated vmas for the same vm.
 		 */
-		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
+		vma = i915_gem_obj_lookup_or_create_vma(obj, bind_vm);
 		if (IS_ERR(vma)) {
 			DRM_DEBUG("Failed to lookup VMA\n");
 			ret = PTR_ERR(vma);
@@ -347,8 +357,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(target_i915_obj, vm);
+		vma->bind_vma(vma, target_i915_obj->cache_level, GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -522,11 +532,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -555,14 +566,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -573,9 +576,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -1182,8 +1183,14 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	/* snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but bdw mucks it up again. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE &&
+	    !batch_obj->has_global_gtt_mapping) {
+		/* When we have multiple VMs, we'll need to make sure that we
+		 * allocate space first */
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(batch_obj);
+		BUG_ON(!vma);
+		vma->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
+	}
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 81420e1..fc26a85 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -70,6 +70,11 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
 #define PPAT_CACHED_INDEX		_PAGE_PAT /* WB LLCeLLC */
 #define PPAT_DISPLAY_ELLC_INDEX		_PAGE_PCD /* WT eLLC */
 
+static void ppgtt_bind_vma(struct i915_vma *vma,
+			   enum i915_cache_level cache_level,
+			   u32 flags);
+static void ppgtt_unbind_vma(struct i915_vma *vma);
+
 static inline gen8_gtt_pte_t gen8_pte_encode(dma_addr_t addr,
 					     enum i915_cache_level level,
 					     bool valid)
@@ -748,22 +753,26 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
+static void __always_unused
+ppgtt_bind_vma(struct i915_vma *vma,
+	       enum i915_cache_level cache_level,
+	       u32 flags)
 {
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	WARN_ON(flags);
+
+	vma->vm->insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
+static void __always_unused ppgtt_unbind_vma(struct i915_vma *vma)
 {
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT,
-				true);
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	vma->vm->clear_range(vma->vm,
+			     entry,
+			     vma->obj->base.size >> PAGE_SHIFT,
+			     true);
 }
 
 extern int intel_iommu_gfx_mapped;
@@ -865,8 +874,18 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       true);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
+		if (!vma)
+			continue;
+
 		i915_gem_clflush_object(obj, obj->pin_display);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		/* The bind_vma code tries to be smart about tracking mappings.
+		 * Unfortunately above, we've just wiped out the mappings
+		 * without telling our object about it. So we need to fake it.
+		 */
+		obj->has_global_gtt_mapping = 0;
+		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -1025,16 +1044,18 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 	readl(gtt_base);
 }
 
-static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     unsigned int pg_start,
-				     enum i915_cache_level cache_level)
+
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 unused)
 {
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
 	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
 		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
 
-	intel_gtt_insert_sg_entries(st, pg_start, flags);
-
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	vma->obj->has_global_gtt_mapping = 1;
 }
 
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
@@ -1045,33 +1066,77 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned int first = vma->node.start >> PAGE_SHIFT;
+	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	vma->obj->has_global_gtt_mapping = 0;
+	intel_gtt_clear_range(first, size);
+}
+
+static void ggtt_bind_vma(struct i915_vma *vma,
+			  enum i915_cache_level cache_level,
+			  u32 flags)
 {
-	struct drm_device *dev = obj->base.dev;
+	struct drm_device *dev = vma->vm->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
 
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
+	/* If there is no aliasing PPGTT, or the caller needs a global mapping,
+	 * or we have a global mapping already but the cacheability flags have
+	 * changed, set the global PTEs.
+	 *
+	 * If there is an aliasing PPGTT it is anecdotally faster, so use that
+	 * instead if none of the above hold true.
+	 *
+	 * NB: A global mapping should only be needed for special regions like
+	 * "gtt mappable", SNB errata, or if specified via special execbuf
+	 * flags. At all other times, the GPU will use the aliasing PPGTT.
+	 */
+	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
+		if (!obj->has_global_gtt_mapping ||
+		    (cache_level != obj->cache_level)) {
+			vma->vm->insert_entries(vma->vm, obj->pages, entry,
+						cache_level);
+			obj->has_global_gtt_mapping = 1;
+		}
+	}
 
-	obj->has_global_gtt_mapping = 1;
+	if (dev_priv->mm.aliasing_ppgtt &&
+	    (!obj->has_aliasing_ppgtt_mapping ||
+	     (cache_level != obj->cache_level))) {
+		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
+		appgtt->base.insert_entries(&appgtt->base,
+					    vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+	}
 }
 
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
+static void ggtt_unbind_vma(struct i915_vma *vma)
 {
-	struct drm_device *dev = obj->base.dev;
+	struct drm_device *dev = vma->vm->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT,
-				       true);
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	if (obj->has_global_gtt_mapping) {
+		vma->vm->clear_range(vma->vm, entry,
+				     vma->obj->base.size >> PAGE_SHIFT,
+				     true);
+		obj->has_global_gtt_mapping = 0;
+	}
 
-	obj->has_global_gtt_mapping = 0;
+	if (obj->has_aliasing_ppgtt_mapping) {
+		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
+		appgtt->base.clear_range(&appgtt->base,
+					 entry,
+					 obj->base.size >> PAGE_SHIFT,
+					 true);
+		obj->has_aliasing_ppgtt_mapping = 0;
+	}
 }
 
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
@@ -1446,7 +1511,6 @@ static int i915_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
-	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
 
 	return 0;
 }
@@ -1498,3 +1562,57 @@ int i915_gem_gtt_init(struct drm_device *dev)
 
 	return 0;
 }
+
+static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
+					      struct i915_address_space *vm)
+{
+	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
+	if (vma == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&vma->vma_link);
+	INIT_LIST_HEAD(&vma->mm_list);
+	INIT_LIST_HEAD(&vma->exec_list);
+	vma->vm = vm;
+	vma->obj = obj;
+
+	switch (INTEL_INFO(vm->dev)->gen) {
+	case 8:
+	case 7:
+	case 6:
+		vma->unbind_vma = ggtt_unbind_vma;
+		vma->bind_vma = ggtt_bind_vma;
+		break;
+	case 5:
+	case 4:
+	case 3:
+	case 2:
+		BUG_ON(!i915_is_ggtt(vm));
+		vma->unbind_vma = i915_ggtt_unbind_vma;
+		vma->bind_vma = i915_ggtt_bind_vma;
+		break;
+	default:
+		BUG();
+	}
+
+	/* Keep GGTT vmas first to make debug easier */
+	if (i915_is_ggtt(vm))
+		list_add(&vma->vma_link, &obj->vma_list);
+	else
+		list_add_tail(&vma->vma_link, &obj->vma_list);
+
+	return vma;
+}
+
+struct i915_vma *
+i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
+				  struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+
+	vma = i915_gem_obj_to_vma(obj, vm);
+	if (!vma)
+		vma = __i915_gem_vma_create(obj, vm);
+
+	return vma;
+}
-- 
1.8.4.2
