* [PATCH 00/29] Completion of i915 VMAs v2
@ 2013-07-31 23:59 Ben Widawsky
  2013-07-31 23:59 ` [PATCH 01/29] drm/i915: Create an init vm Ben Widawsky
                   ` (28 more replies)
  0 siblings, 29 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-07-31 23:59 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Sliced and diced the mega patch into tiny little pieces at Daniel's request.
Overall, I think it's a big improvement (TBD whether it was worth the time
and effort, though).

Here I also drop vma->active and leave obj->active; also a request from Daniel.

Finally, I moved the virtual function stuff to the end of the series. I think
Daniel requested that one too, but I can't remember.

Odds are with all this rebasing, I introduced new bugs. I've been a bit too
preoccupied to check each patch thoroughly - but the end result is right, and
works.

Ben Widawsky (29):
  drm/i915: Create an init vm
  drm/i915: Rework drop caches for checkpatch
  drm/i915: Make proper functions for VMs
  drm/i915: Use bound list for inactive shrink
  drm/i915: Add VM to pin
  drm/i915: Use ggtt_vm to save some typing
  drm/i915: Update describe_obj
  drm/i915: Rework __i915_gem_shrink
  drm/i915: thread address space through execbuf
  drm/i915: make caching operate on all address spaces
  drm/i915: BUG_ON put_pages later
  drm/i915: make reset&hangcheck code VM aware
  drm/i915: clear domains for all objects on reset
  drm/i915: Restore PDEs on gtt restore
  drm/i915: Improve VMA comments
  drm/i915: Cleanup more of VMA in destroy
  drm/i915: plumb VM into bind/unbind code
  drm/i915: Use new bind/unbind in eviction code
  drm/i915: turn bound_ggtt checks to bound_any
  drm/i915: Fix up map and fenceable for VMA
  drm/i915: mm_list is per VMA
  drm/i915: Update error capture for VMs
  drm/i915: Add vma to list at creation
  drm/i915: create vmas at execbuf
  drm/i915: Convert execbuf code to use vmas
  drm/i915: Convert active API to VMA
  drm/i915: Add bind/unbind object functions to VM
  drm/i915: Use the new vm [un]bind functions
  drm/i915: eliminate vm->insert_entries()

 drivers/gpu/drm/i915/i915_debugfs.c        |  68 +++--
 drivers/gpu/drm/i915/i915_dma.c            |   4 -
 drivers/gpu/drm/i915/i915_drv.h            | 185 +++++++------
 drivers/gpu/drm/i915/i915_gem.c            | 420 ++++++++++++++++++++---------
 drivers/gpu/drm/i915/i915_gem_context.c    |  17 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  78 +++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 360 ++++++++++++++-----------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 138 ++++++----
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  10 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c     |   9 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      | 111 +++++---
 drivers/gpu/drm/i915/i915_trace.h          |  37 +--
 drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
 drivers/gpu/drm/i915/intel_pm.c            |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |   8 +-
 15 files changed, 904 insertions(+), 545 deletions(-)

-- 
1.8.3.4


* [PATCH 01/29] drm/i915: Create an init vm
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
@ 2013-07-31 23:59 ` Ben Widawsky
  2013-07-31 23:59 ` [PATCH 02/29] drm/i915: Rework drop caches for checkpatch Ben Widawsky
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-07-31 23:59 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Move all the similar address space (VM) initialization code to one
function. Until we have multiple VMs, there should only ever be 1 VM.
The aliasing ppgtt is a special case without its own VM (since it
doesn't need its own address space management).
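
As an aside (not part of this patch), any future address space would be
expected to go through the same helper. A minimal sketch of that pattern,
assuming a PPGTT that embeds an i915_address_space as ->base like the
aliasing ppgtt does; i915_init_vm() is static to i915_gem.c, so this is
purely illustrative:

	/* Hypothetical future caller, illustration only. */
	static void example_setup_ppgtt_vm(struct drm_i915_private *dev_priv,
					   struct i915_hw_ppgtt *ppgtt)
	{
		/* The common init puts the VM on vm_list with its
		 * active/inactive lists ready to go. */
		i915_init_vm(dev_priv, &ppgtt->base);
	}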

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c |  4 ----
 drivers/gpu/drm/i915/i915_gem.c | 15 +++++++++++++--
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 15ed8f5..488f44b 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	i915_dump_device_info(dev_priv);
 
-	INIT_LIST_HEAD(&dev_priv->vm_list);
-	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
-	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
-
 	if (i915_get_bridge_dev(dev)) {
 		ret = -EIO;
 		goto free_priv;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2283765..5ca8ee6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4319,6 +4319,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
 	INIT_LIST_HEAD(&ring->request_list);
 }
 
+static void i915_init_vm(struct drm_i915_private *dev_priv,
+			 struct i915_address_space *vm)
+{
+	vm->dev = dev_priv->dev;
+	INIT_LIST_HEAD(&vm->active_list);
+	INIT_LIST_HEAD(&vm->inactive_list);
+	INIT_LIST_HEAD(&vm->global_link);
+	list_add(&vm->global_link, &dev_priv->vm_list);
+}
+
 void
 i915_gem_load(struct drm_device *dev)
 {
@@ -4331,8 +4341,9 @@ i915_gem_load(struct drm_device *dev)
 				  SLAB_HWCACHE_ALIGN,
 				  NULL);
 
-	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
-	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
+	INIT_LIST_HEAD(&dev_priv->vm_list);
+	i915_init_vm(dev_priv, &dev_priv->gtt.base);
+
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
-- 
1.8.3.4


* [PATCH 02/29] drm/i915: Rework drop caches for checkpatch
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
  2013-07-31 23:59 ` [PATCH 01/29] drm/i915: Create an init vm Ben Widawsky
@ 2013-07-31 23:59 ` Ben Widawsky
  2013-08-03 11:32   ` Chris Wilson
  2013-07-31 23:59 ` [PATCH 03/29] drm/i915: Make proper functions for VMs Ben Widawsky
                   ` (26 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-07-31 23:59 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With an upcoming change to bind, we need to rework this code a bit to keep
checkpatch happy and the code clean.

This should have no functional impact.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index be69807..61ffa71 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1781,12 +1781,13 @@ i915_drop_caches_set(void *data, u64 val)
 
 	if (val & DROP_BOUND) {
 		list_for_each_entry_safe(obj, next, &vm->inactive_list,
-					 mm_list)
-			if (obj->pin_count == 0) {
-				ret = i915_gem_object_unbind(obj);
-				if (ret)
-					goto unlock;
-			}
+					 mm_list) {
+			if (obj->pin_count)
+				continue;
+			ret = i915_gem_object_unbind(obj);
+			if (ret)
+				goto unlock;
+		}
 	}
 
 	if (val & DROP_UNBOUND) {
-- 
1.8.3.4


* [PATCH 03/29] drm/i915: Make proper functions for VMs
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
  2013-07-31 23:59 ` [PATCH 01/29] drm/i915: Create an init vm Ben Widawsky
  2013-07-31 23:59 ` [PATCH 02/29] drm/i915: Rework drop caches for checkpatch Ben Widawsky
@ 2013-07-31 23:59 ` Ben Widawsky
  2013-07-31 23:59 ` [PATCH 04/29] drm/i915: Use bound list for inactive shrink Ben Widawsky
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-07-31 23:59 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Earlier in the conversion sequence we attempted to quickly wedge in the
transitional interface as static inlines.

Now that we're sure these interfaces are sane, for easier debugging and to
decrease code size (since many of these functions may be called quite a
bit), make them real functions.

While at it, kill off the set_color interface. We'll always have the
VMA, or easily get to it.
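
For reference, a caller of the new out-of-line helpers might look like the
sketch below. The helper names match the ones added in this patch; the
example function itself is hypothetical:

	/* Hypothetical debug helper, illustration only. */
	static void example_print_binding(struct drm_i915_gem_object *obj,
					  struct i915_address_space *vm)
	{
		if (!i915_gem_obj_bound(obj, vm))	/* no VMA in this VM */
			return;

		DRM_DEBUG("bound at 0x%lx, size 0x%lx\n",
			  i915_gem_obj_offset(obj, vm),
			  i915_gem_obj_size(obj, vm));
	}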

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h       | 83 ++++++++++++++++-------------------
 drivers/gpu/drm/i915/i915_gem.c       | 78 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_evict.c |  8 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  2 +-
 4 files changed, 118 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8b3167e..baefb5c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1379,52 +1379,6 @@ struct drm_i915_gem_object {
 
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
-/* This is a temporary define to help transition us to real VMAs. If you see
- * this, you're either reviewing code, or bisecting it. */
-static inline struct i915_vma *
-__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
-{
-	if (list_empty(&obj->vma_list))
-		return NULL;
-	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
-}
-
-/* Whether or not this object is currently mapped by the translation tables */
-static inline bool
-i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
-{
-	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
-	if (vma == NULL)
-		return false;
-	return drm_mm_node_allocated(&vma->node);
-}
-
-/* Offset of the first PTE pointing to this object */
-static inline unsigned long
-i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
-{
-	BUG_ON(list_empty(&o->vma_list));
-	return __i915_gem_obj_to_vma(o)->node.start;
-}
-
-/* The size used in the translation tables may be larger than the actual size of
- * the object on GEN2/GEN3 because of the way tiling is handled. See
- * i915_gem_get_gtt_size() for more details.
- */
-static inline unsigned long
-i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
-{
-	BUG_ON(list_empty(&o->vma_list));
-	return __i915_gem_obj_to_vma(o)->node.size;
-}
-
-static inline void
-i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
-			    enum i915_cache_level color)
-{
-	__i915_gem_obj_to_vma(o)->node.color = color;
-}
-
 /**
  * Request queue structure.
  *
@@ -1886,6 +1840,43 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 
 void i915_gem_restore_fences(struct drm_device *dev);
 
+unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
+				  struct i915_address_space *vm);
+bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
+bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
+			struct i915_address_space *vm);
+unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
+				struct i915_address_space *vm);
+struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm);
+/* Some GGTT VM helpers */
+#define obj_to_ggtt(obj) \
+	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
+static inline bool i915_is_ggtt(struct i915_address_space *vm)
+{
+	struct i915_address_space *ggtt =
+		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
+	return vm == ggtt;
+}
+
+static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
+}
+
+static inline unsigned long
+i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
+}
+
+static inline unsigned long
+i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
+}
+#undef obj_to_ggtt
+
 /* i915_gem_context.c */
 void i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5ca8ee6..586172a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2617,7 +2617,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
-	vma = __i915_gem_obj_to_vma(obj);
+	vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
 	list_del(&vma->vma_link);
 	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
@@ -3303,7 +3303,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+	struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
 	int ret;
 
 	if (obj->cache_level == cache_level)
@@ -3343,7 +3343,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
 					       obj, cache_level);
 
-		i915_gem_obj_ggtt_set_color(obj, cache_level);
+		i915_gem_obj_to_vma(obj, &dev_priv->gtt.base)->node.color = cache_level;
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -4651,3 +4651,75 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 		mutex_unlock(&dev->struct_mutex);
 	return cnt;
 }
+
+/* All the new VM stuff */
+unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
+				  struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_vma *vma;
+
+	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
+		vm = &dev_priv->gtt.base;
+
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm)
+			return vma->node.start;
+
+	}
+	return -1;
+}
+
+bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
+			struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+
+	list_for_each_entry(vma, &o->vma_list, vma_link)
+		if (vma->vm == vm)
+			return true;
+
+	return false;
+}
+
+bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_address_space *vm;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		if (i915_gem_obj_bound(o, vm))
+			return true;
+
+	return false;
+}
+
+unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
+				struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_vma *vma;
+
+	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
+		vm = &dev_priv->gtt.base;
+
+	BUG_ON(list_empty(&o->vma_list));
+
+	list_for_each_entry(vma, &o->vma_list, vma_link)
+		if (vma->vm == vm)
+			return vma->node.size;
+
+	return 0;
+}
+
+struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+	list_for_each_entry(vma, &obj->vma_list, vma_link)
+		if (vma->vm == vm)
+			return vma;
+
+	return NULL;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index df61f33..33d85a4 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -34,7 +34,9 @@
 static bool
 mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
 {
-	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
 
 	if (obj->pin_count)
 		return false;
@@ -109,7 +111,7 @@ none:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = __i915_gem_obj_to_vma(obj);
+		vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
 		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
@@ -130,7 +132,7 @@ found:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = __i915_gem_obj_to_vma(obj);
+		vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
 		if (drm_mm_scan_remove_block(&vma->node)) {
 			list_move(&obj->exec_list, &eviction_list);
 			drm_gem_object_reference(&obj->base);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3b639a9..d88e1c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -662,7 +662,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
 		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
 			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
-- 
1.8.3.4


* [PATCH 04/29] drm/i915: Use bound list for inactive shrink
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (2 preceding siblings ...)
  2013-07-31 23:59 ` [PATCH 03/29] drm/i915: Make proper functions for VMs Ben Widawsky
@ 2013-07-31 23:59 ` Ben Widawsky
  2013-07-31 23:59 ` [PATCH 05/29] drm/i915: Add VM to pin Ben Widawsky
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-07-31 23:59 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Due to the move of the active/inactive lists, it no longer makes sense to use
them for shrinking, since shrinking isn't VM specific (such a need may
also exist, but doesn't yet).

What we can do instead is use the global bound list to find all objects
which aren't active.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 586172a..6f08303 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4614,7 +4614,6 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 			     struct drm_i915_private,
 			     mm.inactive_shrinker);
 	struct drm_device *dev = dev_priv->dev;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	int nr_to_scan = sc->nr_to_scan;
 	bool unlock = true;
@@ -4643,9 +4642,14 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
 		if (obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
-	list_for_each_entry(obj, &vm->inactive_list, mm_list)
+
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		if (obj->active)
+			continue;
+
 		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
+	}
 
 	if (unlock)
 		mutex_unlock(&dev->struct_mutex);
-- 
1.8.3.4


* [PATCH 05/29] drm/i915: Add VM to pin
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (3 preceding siblings ...)
  2013-07-31 23:59 ` [PATCH 04/29] drm/i915: Use bound list for inactive shrink Ben Widawsky
@ 2013-07-31 23:59 ` Ben Widawsky
  2013-07-31 23:59 ` [PATCH 06/29] drm/i915: Use ggtt_vm to save some typing Ben Widawsky
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-07-31 23:59 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

To verbalize it, one can say, "pin an object into the given address
space." The semantics of pinning remain the same otherwise.

Certain objects will always have to be bound into the global GTT.
Therefore, the global GTT is a special case, and we keep a special
interface around for it (i915_gem_obj_ggtt_pin).

v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin
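
For illustration, the two call styles after this patch (obj, vm and ret are
assumed to exist in the caller):

	/* Pin into an arbitrary address space. */
	ret = i915_gem_object_pin(obj, vm, 4096, false, false);

	/* Shorthand for the common global GTT case. */
	ret = i915_gem_obj_ggtt_pin(obj, 4096, true, false);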

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 11 +++++++++++
 drivers/gpu/drm/i915/i915_gem.c            |  9 +++++----
 drivers/gpu/drm/i915/i915_gem_context.c    |  4 ++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 +++-
 drivers/gpu/drm/i915/intel_overlay.c       |  2 +-
 drivers/gpu/drm/i915/intel_pm.c            |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  8 ++++----
 7 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index baefb5c..2579e96 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1690,6 +1690,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 void i915_gem_vma_destroy(struct i915_vma *vma);
 
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
@@ -1875,6 +1876,16 @@ i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
 {
 	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
 }
+
+static inline int __must_check
+i915_gem_obj_ggtt_pin(struct drm_i915_gem_object *obj,
+		      uint32_t alignment,
+		      bool map_and_fenceable,
+		      bool nonblocking)
+{
+	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
+				   map_and_fenceable, nonblocking);
+}
 #undef obj_to_ggtt
 
 /* i915_gem_context.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6f08303..3aaf875 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -578,7 +578,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	char __user *user_data;
 	int page_offset, page_length, ret;
 
-	ret = i915_gem_object_pin(obj, 0, true, true);
+	ret = i915_gem_obj_ggtt_pin(obj, 0, true, true);
 	if (ret)
 		goto out;
 
@@ -1332,7 +1332,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	}
 
 	/* Now bind it into the GTT if needed */
-	ret = i915_gem_object_pin(obj, 0, true, false);
+	ret = i915_gem_obj_ggtt_pin(obj,  0, true, false);
 	if (ret)
 		goto unlock;
 
@@ -3472,7 +3472,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * (e.g. libkms for the bootup splash), we have to ensure that we
 	 * always use map_and_fenceable for all scanout buffers.
 	 */
-	ret = i915_gem_object_pin(obj, alignment, true, false);
+	ret = i915_gem_obj_ggtt_pin(obj, alignment, true, false);
 	if (ret)
 		return ret;
 
@@ -3615,6 +3615,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 int
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
+		    struct i915_address_space *vm,
 		    uint32_t alignment,
 		    bool map_and_fenceable,
 		    bool nonblocking)
@@ -3704,7 +3705,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	}
 
 	if (obj->user_pin_count == 0) {
-		ret = i915_gem_object_pin(obj, args->alignment, true, false);
+		ret = i915_gem_obj_ggtt_pin(obj, args->alignment, true, false);
 		if (ret)
 			goto out;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2470206..d1cb28c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -214,7 +214,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 	 * default context.
 	 */
 	dev_priv->ring[RCS].default_context = ctx;
-	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
+	ret = i915_gem_obj_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
 	if (ret) {
 		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
 		goto err_destroy;
@@ -400,7 +400,7 @@ static int do_switch(struct i915_hw_context *to)
 	if (from == to)
 		return 0;
 
-	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
+	ret = i915_gem_obj_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 1734825..4c8d20f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -409,7 +409,9 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->tiling_mode != I915_TILING_NONE;
 	need_mappable = need_fence || need_reloc_mappable(obj);
 
-	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
+	/* FIXME: vm plubming */
+	ret = i915_gem_object_pin(obj, &dev_priv->gtt.base, entry->alignment,
+				  need_mappable, false);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 2abb53e..6d1e1bb 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
 		}
 		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
 	} else {
-		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
+		ret = i915_gem_obj_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
 		if (ret) {
 			DRM_ERROR("failed to pin overlay register bo\n");
 			goto out_free_bo;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 008e0e0..5e3c1ed 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev)
 		return NULL;
 	}
 
-	ret = i915_gem_object_pin(ctx, 4096, true, false);
+	ret = i915_gem_obj_ggtt_pin(ctx, 4096, true, false);
 	if (ret) {
 		DRM_ERROR("failed to pin power context: %d\n", ret);
 		goto err_unref;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8527ea0..74d02a7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -501,7 +501,7 @@ init_pipe_control(struct intel_ring_buffer *ring)
 
 	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true, false);
+	ret = i915_gem_obj_ggtt_pin(obj, 4096, true, false);
 	if (ret)
 		goto err_unref;
 
@@ -1224,7 +1224,7 @@ static int init_status_page(struct intel_ring_buffer *ring)
 
 	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true, false);
+	ret = i915_gem_obj_ggtt_pin(obj, 4096, true, false);
 	if (ret != 0) {
 		goto err_unref;
 	}
@@ -1307,7 +1307,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 
 	ring->obj = obj;
 
-	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
+	ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, true, false);
 	if (ret)
 		goto err_unref;
 
@@ -1828,7 +1828,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 			return -ENOMEM;
 		}
 
-		ret = i915_gem_object_pin(obj, 0, true, false);
+		ret = i915_gem_obj_ggtt_pin(obj, 0, true, false);
 		if (ret != 0) {
 			drm_gem_object_unreference(&obj->base);
 			DRM_ERROR("Failed to ping batch bo\n");
-- 
1.8.3.4


* [PATCH 06/29] drm/i915: Use ggtt_vm to save some typing
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (4 preceding siblings ...)
  2013-07-31 23:59 ` [PATCH 05/29] drm/i915: Add VM to pin Ben Widawsky
@ 2013-07-31 23:59 ` Ben Widawsky
  2013-08-01  0:00 ` [PATCH 07/29] drm/i915: Update describe_obj Ben Widawsky
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-07-31 23:59 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Just some small cleanups, and a rename of vm->ggtt_vm requested by
Daniel.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 19 ++++++++-----------
 drivers/gpu/drm/i915/i915_gem_stolen.c | 10 +++++-----
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d88e1c9..1ed9acb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -648,7 +648,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	 * aperture.  One page should be enough to keep any prefetching inside
 	 * of the aperture.
 	 */
-	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
 	struct drm_mm_node *entry;
 	struct drm_i915_gem_object *obj;
 	unsigned long hole_start, hole_end;
@@ -656,19 +657,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	BUG_ON(mappable_end > end);
 
 	/* Subtract the guard page ... */
-	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
+	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
 	if (!HAS_LLC(dev))
 		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
 		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
 			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
 
 		WARN_ON(i915_gem_obj_ggtt_bound(obj));
-		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
+		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
 		if (ret)
 			DRM_DEBUG_KMS("Reservation failed\n");
 		obj->has_global_gtt_mapping = 1;
@@ -679,19 +680,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	dev_priv->gtt.base.total = end - start;
 
 	/* Clear any non-preallocated blocks */
-	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
-			     hole_start, hole_end) {
+	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
 		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
 			      hole_start, hole_end);
-		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-					       hole_start / PAGE_SIZE,
-					       count);
+		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
 	}
 
 	/* And finally clear the reserved guard page */
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       end / PAGE_SIZE - 1, 1);
+	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
 }
 
 static bool
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 27ffb4c..000ffbd 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					       u32 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *ggtt = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	struct i915_vma *vma;
@@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (gtt_offset == I915_GTT_OFFSET_NONE)
 		return obj;
 
-	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
+	vma = i915_gem_vma_create(obj, ggtt);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
 		goto err_out;
@@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	 */
 	vma->node.start = gtt_offset;
 	vma->node.size = size;
-	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
-		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
+	if (drm_mm_initialized(&ggtt->mm)) {
+		ret = drm_mm_reserve_node(&ggtt->mm, &vma->node);
 		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
 			i915_gem_vma_destroy(vma);
@@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
 
 	return obj;
 
-- 
1.8.3.4


* [PATCH 07/29] drm/i915: Update describe_obj
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (5 preceding siblings ...)
  2013-07-31 23:59 ` [PATCH 06/29] drm/i915: Use ggtt_vm to save some typing Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-01  0:00 ` [PATCH 08/29] drm/i915: Rework __i915_gem_shrink Ben Widawsky
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Make it aware of which address space the object is bound into: GGTT or PPGTT.

While modifying the function, add a global gtt flag to the object
description. Global is more interesting than aliasing since aliasing is
the default.

v2: Access VMA directly for start/size instead of helpers (Daniel)
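
To illustrate with hypothetical values, a bound object's line would now end
with a per-VMA annotation (and gains the new global GTT flag "g"), e.g.:

	... g 4096KiB ... (ggtt offset: 00200000, size: 00400000)

and once objects are bound into real PPGTT address spaces, those VMAs would
show up as additional "(ppgtt offset: ..., size: ...)" entries.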

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 61ffa71..d6154cb 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -89,13 +89,20 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj)
 	}
 }
 
+static inline const char *get_global_flag(struct drm_i915_gem_object *obj)
+{
+	return obj->has_global_gtt_mapping ? "g" : " ";
+}
+
 static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
-	seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s",
+	struct i915_vma *vma;
+	seq_printf(m, "%pK: %s%s%s %8zdKiB %02x %02x %d %d %d%s%s%s",
 		   &obj->base,
 		   get_pin_flag(obj),
 		   get_tiling_flag(obj),
+		   get_global_flag(obj),
 		   obj->base.size / 1024,
 		   obj->base.read_domains,
 		   obj->base.write_domain,
@@ -111,9 +118,14 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (pinned x %d)", obj->pin_count);
 	if (obj->fence_reg != I915_FENCE_REG_NONE)
 		seq_printf(m, " (fence: %d)", obj->fence_reg);
-	if (i915_gem_obj_ggtt_bound(obj))
-		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
-			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (!i915_is_ggtt(vma->vm))
+			seq_puts(m, " (pp");
+		else
+			seq_puts(m, " (g");
+		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
+			   vma->node.start, vma->node.size);
+	}
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
 	if (obj->pin_mappable || obj->fault_mappable) {
-- 
1.8.3.4


* [PATCH 08/29] drm/i915: Rework __i915_gem_shrink
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (6 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 07/29] drm/i915: Update describe_obj Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-05  8:59   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 09/29] drm/i915: thread address space through execbuf Ben Widawsky
                   ` (20 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

In order to do this for all VMs, it's convenient to rework the logic a
bit. This should have no functional impact.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3aaf875..3ce9d0d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1693,9 +1693,14 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 	}
 
 	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
-		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
-		    i915_gem_object_unbind(obj) == 0 &&
-		    i915_gem_object_put_pages(obj) == 0) {
+
+		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
+			continue;
+
+		if (i915_gem_object_unbind(obj))
+			continue;
+
+		if (!i915_gem_object_put_pages(obj)) {
 			count += obj->base.size >> PAGE_SHIFT;
 			if (count >= target)
 				return count;
-- 
1.8.3.4


* [PATCH 09/29] drm/i915: thread address space through execbuf
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (7 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 08/29] drm/i915: Rework __i915_gem_shrink Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-05  9:39   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 10/29] drm/i915: make caching operate on all address spaces Ben Widawsky
                   ` (19 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This represents the first half of hooking up VMs to execbuf. Here we
basically pass an address space all around to the different internal
functions. It should be much more readable, and have less risk than the
second half, which begins switching over to using VMAs instead of an
obj,vm.

The overall series echoes this style: "add a VM, then make it smart
later."

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 77 +++++++++++++++++++-----------
 1 file changed, 49 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 4c8d20f..a23b80f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 				   struct eb_objects *eb,
-				   struct drm_i915_gem_relocation_entry *reloc)
+				   struct drm_i915_gem_relocation_entry *reloc,
+				   struct i915_address_space *vm)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_gem_object *target_obj;
@@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 
 static int
 i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
-				    struct eb_objects *eb)
+				    struct eb_objects *eb,
+				    struct i915_address_space *vm)
 {
 #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
 	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
@@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 		do {
 			u64 offset = r->presumed_offset;
 
-			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
+			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
+								 vm);
 			if (ret)
 				return ret;
 
@@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 static int
 i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 					 struct eb_objects *eb,
-					 struct drm_i915_gem_relocation_entry *relocs)
+					 struct drm_i915_gem_relocation_entry *relocs,
+					 struct i915_address_space *vm)
 {
 	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 	int i, ret;
 
 	for (i = 0; i < entry->relocation_count; i++) {
-		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
+		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
+							 vm);
 		if (ret)
 			return ret;
 	}
@@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate(struct eb_objects *eb)
+i915_gem_execbuffer_relocate(struct eb_objects *eb,
+			     struct i915_address_space *vm)
 {
 	struct drm_i915_gem_object *obj;
 	int ret = 0;
@@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
 	 */
 	pagefault_disable();
 	list_for_each_entry(obj, &eb->objects, exec_list) {
-		ret = i915_gem_execbuffer_relocate_object(obj, eb);
+		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
 		if (ret)
 			break;
 	}
@@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 				   struct intel_ring_buffer *ring,
+				   struct i915_address_space *vm,
 				   bool *need_reloc)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
@@ -409,9 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->tiling_mode != I915_TILING_NONE;
 	need_mappable = need_fence || need_reloc_mappable(obj);
 
-	/* FIXME: vm plubming */
-	ret = i915_gem_object_pin(obj, &dev_priv->gtt.base, entry->alignment,
-				  need_mappable, false);
+	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
+				  false);
 	if (ret)
 		return ret;
 
@@ -438,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->has_aliasing_ppgtt_mapping = 1;
 	}
 
-	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
-		entry->offset = i915_gem_obj_ggtt_offset(obj);
+	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
+		entry->offset = i915_gem_obj_offset(obj, vm);
 		*need_reloc = true;
 	}
 
@@ -477,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			    struct list_head *objects,
+			    struct i915_address_space *vm,
 			    bool *need_relocs)
 {
 	struct drm_i915_gem_object *obj;
@@ -531,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 		list_for_each_entry(obj, objects, exec_list) {
 			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 			bool need_fence, need_mappable;
+			u32 obj_offset;
 
-			if (!i915_gem_obj_ggtt_bound(obj))
+			if (!i915_gem_obj_bound(obj, vm))
 				continue;
 
+			obj_offset = i915_gem_obj_offset(obj, vm);
 			need_fence =
 				has_fenced_gpu_access &&
 				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 				obj->tiling_mode != I915_TILING_NONE;
 			need_mappable = need_fence || need_reloc_mappable(obj);
 
+			BUG_ON((need_mappable || need_fence) &&
+			       !i915_is_ggtt(vm));
+
 			if ((entry->alignment &&
-			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
+			     obj_offset & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
 				ret = i915_gem_object_unbind(obj);
 			else
-				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
+				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
 				goto err;
 		}
 
 		/* Bind fresh objects */
 		list_for_each_entry(obj, objects, exec_list) {
-			if (i915_gem_obj_ggtt_bound(obj))
+			if (i915_gem_obj_bound(obj, vm))
 				continue;
 
-			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
+			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
 				goto err;
 		}
@@ -580,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 				  struct drm_file *file,
 				  struct intel_ring_buffer *ring,
 				  struct eb_objects *eb,
-				  struct drm_i915_gem_exec_object2 *exec)
+				  struct drm_i915_gem_exec_object2 *exec,
+				  struct i915_address_space *vm)
 {
 	struct drm_i915_gem_relocation_entry *reloc;
 	struct drm_i915_gem_object *obj;
@@ -664,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 		goto err;
 
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
 	if (ret)
 		goto err;
 
 	list_for_each_entry(obj, &eb->objects, exec_list) {
 		int offset = obj->exec_entry - exec;
 		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
-							       reloc + reloc_offset[offset]);
+							       reloc + reloc_offset[offset],
+							       vm);
 		if (ret)
 			goto err;
 	}
@@ -772,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
 
 static void
 i915_gem_execbuffer_move_to_active(struct list_head *objects,
+				   struct i915_address_space *vm,
 				   struct intel_ring_buffer *ring)
 {
 	struct drm_i915_gem_object *obj;
@@ -840,7 +855,8 @@ static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec)
+		       struct drm_i915_gem_exec_object2 *exec,
+		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct eb_objects *eb;
@@ -1002,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	/* Move the objects en-masse into the GTT, evicting if necessary. */
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
 	if (ret)
 		goto err;
 
 	/* The objects are in their final locations, apply the relocations. */
 	if (need_relocs)
-		ret = i915_gem_execbuffer_relocate(eb);
+		ret = i915_gem_execbuffer_relocate(eb, vm);
 	if (ret) {
 		if (ret == -EFAULT) {
 			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
-								eb, exec);
+								eb, exec, vm);
 			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 		}
 		if (ret)
@@ -1063,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
+	exec_start = i915_gem_obj_offset(batch_obj, vm) +
+		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
@@ -1088,7 +1105,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
-	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
+	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
 	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
 
 err:
@@ -1109,6 +1126,7 @@ int
 i915_gem_execbuffer(struct drm_device *dev, void *data,
 		    struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer *args = data;
 	struct drm_i915_gem_execbuffer2 exec2;
 	struct drm_i915_gem_exec_object *exec_list = NULL;
@@ -1164,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 	exec2.flags = I915_EXEC_RENDER;
 	i915_execbuffer2_set_context_id(exec2, 0);
 
-	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
+	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
+				     &dev_priv->gtt.base);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		for (i = 0; i < args->buffer_count; i++)
@@ -1190,6 +1209,7 @@ int
 i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		     struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
 	int ret;
@@ -1220,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
+	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
+				     &dev_priv->gtt.base);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
-- 
1.8.3.4


* [PATCH 10/29] drm/i915: make caching operate on all address spaces
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (8 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 09/29] drm/i915: thread address space through execbuf Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-05  9:41   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 11/29] drm/i915: BUG_ON put_pages later Ben Widawsky
                   ` (18 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

For now, objects will maintain the same cache levels amongst all address
spaces. This is to limit the risk of bugs, as playing with cacheability
in the different domains can be very error prone.

In the future, it may be optimal to allow setting domains per VMA (i.e.
an object bound into an address space).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3ce9d0d..adb0a18 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3308,7 +3308,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
+	struct i915_vma *vma;
 	int ret;
 
 	if (obj->cache_level == cache_level)
@@ -3319,13 +3319,17 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		return -EBUSY;
 	}
 
-	if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
-		ret = i915_gem_object_unbind(obj);
-		if (ret)
-			return ret;
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
+			ret = i915_gem_object_unbind(obj);
+			if (ret)
+				return ret;
+
+			break;
+		}
 	}
 
-	if (i915_gem_obj_ggtt_bound(obj)) {
+	if (i915_gem_obj_bound_any(obj)) {
 		ret = i915_gem_object_finish_gpu(obj);
 		if (ret)
 			return ret;
@@ -3347,8 +3351,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		if (obj->has_aliasing_ppgtt_mapping)
 			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
 					       obj, cache_level);
-
-		i915_gem_obj_to_vma(obj, &dev_priv->gtt.base)->node.color = cache_level;
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -3374,6 +3376,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 						    old_write_domain);
 	}
 
+	list_for_each_entry(vma, &obj->vma_list, vma_link)
+		vma->node.color = cache_level;
 	obj->cache_level = cache_level;
 	i915_gem_verify_gtt(dev);
 	return 0;
-- 
1.8.3.4


* [PATCH 11/29] drm/i915: BUG_ON put_pages later
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (9 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 10/29] drm/i915: make caching operate on all address spaces Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-05  9:42   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 12/29] drm/i915: make reset&hangcheck code VM aware Ben Widawsky
                   ` (17 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With multiple VMs, the eviction code benefits from being able to blindly
put pages without needing to know if there are any entities still
holding on to those pages. As such it's preferable to return the -EBUSY
before the BUG.

Eviction code is the only user for now, but overall it makes sense
anyway, IMO.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index adb0a18..dbf72d5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1654,11 +1654,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	if (obj->pages == NULL)
 		return 0;
 
-	BUG_ON(i915_gem_obj_ggtt_bound(obj));
-
 	if (obj->pages_pin_count)
 		return -EBUSY;
 
+	BUG_ON(i915_gem_obj_ggtt_bound(obj));
+
 	/* ->put_pages might need to allocate memory for the bit17 swizzle
 	 * array, hence protect them from being reaped by removing them from gtt
 	 * lists early. */
-- 
1.8.3.4


* [PATCH 12/29] drm/i915: make reset&hangcheck code VM aware
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (10 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 11/29] drm/i915: BUG_ON put_pages later Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-01  0:00 ` [PATCH 13/29] drm/i915: clear domains for all objects on reset Ben Widawsky
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Hangcheck, and some of the recent reset code for guilty batches need to
know which address space the object was in at the time of a hangcheck.
This is because we use offsets in the (PP|G)GTT to determine this
information, and those offsets can differ depending on which VM they are
bound into.

Since we still only ever have 1 VM, this code shouldn't have any impact
yet.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dbf72d5..b4c35f0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2110,10 +2110,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 	spin_unlock(&file_priv->mm.lock);
 }
 
-static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
+static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm)
 {
-	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
-	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
+	if (acthd >= i915_gem_obj_offset(obj, vm) &&
+	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
 		return true;
 
 	return false;
@@ -2136,6 +2137,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
 	return false;
 }
 
+static struct i915_address_space *
+request_to_vm(struct drm_i915_gem_request *request)
+{
+	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
+	struct i915_address_space *vm;
+
+	vm = &dev_priv->gtt.base;
+
+	return vm;
+}
+
 static bool i915_request_guilty(struct drm_i915_gem_request *request,
 				const u32 acthd, bool *inside)
 {
@@ -2143,9 +2155,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
 	 * pointing inside the ring, matches the batch_obj address range.
 	 * However this is extremely unlikely.
 	 */
-
 	if (request->batch_obj) {
-		if (i915_head_inside_object(acthd, request->batch_obj)) {
+		if (i915_head_inside_object(acthd, request->batch_obj,
+					    request_to_vm(request))) {
 			*inside = true;
 			return true;
 		}
@@ -2165,17 +2177,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
 {
 	struct i915_ctx_hang_stats *hs = NULL;
 	bool inside, guilty;
+	unsigned long offset = 0;
 
 	/* Innocent until proven guilty */
 	guilty = false;
 
+	if (request->batch_obj)
+		offset = i915_gem_obj_offset(request->batch_obj,
+					     request_to_vm(request));
+
 	if (ring->hangcheck.action != wait &&
 	    i915_request_guilty(request, acthd, &inside)) {
 		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
 			  ring->name,
 			  inside ? "inside" : "flushing",
-			  request->batch_obj ?
-			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
+			  offset,
 			  request->ctx ? request->ctx->id : 0,
 			  acthd);
 
-- 
1.8.3.4


* [PATCH 13/29] drm/i915: clear domains for all objects on reset
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (11 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 12/29] drm/i915: make reset&hangcheck code VM aware Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-03 10:59   ` Chris Wilson
  2013-08-05 16:46   ` [PATCH 13/29] drm/i915: eliminate dead domain clearing " Ben Widawsky
  2013-08-01  0:00 ` [PATCH 14/29] drm/i915: Restore PDEs on gtt restore Ben Widawsky
                   ` (15 subsequent siblings)
  28 siblings, 2 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Simply iterating over 1 inactive list is insufficient for the way we now
track inactive (1 list per address space). We could alternatively do
this with bound + unbound lists, and an inactive check. To me, this way
is a bit easier to understand.
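
For comparison, the rejected alternative would have looked roughly like
this (sketch only; dev_priv is assumed from the surrounding reset code,
and it walks the global bound list with an inactive check instead of
walking every VM):

        struct drm_i915_gem_object *obj;

        list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
                if (!obj->active)
                        obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;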

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b4c35f0..8ce3545 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2282,7 +2282,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
 	int i;
@@ -2293,8 +2293,9 @@ void i915_gem_reset(struct drm_device *dev)
 	/* Move everything out of the GPU domains to ensure we do any
 	 * necessary invalidation upon reuse.
 	 */
-	list_for_each_entry(obj, &vm->inactive_list, mm_list)
-		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		list_for_each_entry(obj, &vm->inactive_list, mm_list)
+			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 
 	i915_gem_restore_fences(dev);
 }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 14/29] drm/i915: Restore PDEs on gtt restore
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (12 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 13/29] drm/i915: clear domains for all objects on reset Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-06 18:14   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 15/29] drm/i915: Improve VMA comments Ben Widawsky
                   ` (14 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

I can't remember why I added this initially.

TODO: Throw out if not necessary

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1ed9acb..e9b269f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -470,6 +470,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.start / PAGE_SIZE,
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
+	if (dev_priv->mm.aliasing_ppgtt)
+		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
+
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		i915_gem_clflush_object(obj);
 		i915_gem_gtt_bind_object(obj, obj->cache_level);
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 15/29] drm/i915: Improve VMA comments
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (13 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 14/29] drm/i915: Restore PDEs on gtt restore Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-01  0:00 ` [PATCH 16/29] drm/i915: Cleanup more of VMA in destroy Ben Widawsky
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2579e96..dbfffb2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -533,7 +533,12 @@ struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
-/* To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
  * will always be <= an objects lifetime. So object refcounting should cover us.
  */
 struct i915_vma {
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 16/29] drm/i915: Cleanup more of VMA in destroy
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (14 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 15/29] drm/i915: Improve VMA comments Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-01  0:00 ` [PATCH 17/29] drm/i915: plumb VM into bind/unbind code Ben Widawsky
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Remove the VMA from the object's list, and remove the VMA's node from
the allocator. This just helps avoid duplication, since we always want
those two things to occur on destroy, and at least for now, we only do
those two actions on destroy anyway.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8ce3545..4b669e8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2640,8 +2640,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	obj->map_and_fenceable = true;
 
 	vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
-	list_del(&vma->vma_link);
-	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
 
 	/* Since the unbound list is global, only move to that list if
@@ -3176,7 +3174,6 @@ search_free:
 	return 0;
 
 err_out:
-	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
 	i915_gem_object_unpin_pages(obj);
 	return ret;
@@ -4020,7 +4017,8 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 
 void i915_gem_vma_destroy(struct i915_vma *vma)
 {
-	WARN_ON(vma->node.allocated);
+	list_del_init(&vma->vma_link);
+	drm_mm_remove_node(&vma->node);
 	kfree(vma);
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 17/29] drm/i915: plumb VM into bind/unbind code
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (15 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 16/29] drm/i915: Cleanup more of VMA in destroy Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-06 18:29   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code Ben Widawsky
                   ` (11 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

As alluded to in several patches, and as will be reiterated later: a
VMA is an abstraction for a GEM BO bound into an address space.
Therefore it stands to reason that the existing bind and unbind paths
are the ones most impacted. This patch implements that change, and
updates all callers which weren't already updated earlier in the
series (because it was too messy).

This patch represents the bulk of an earlier, larger patch. I've
pulled a bunch of things out of it at Daniel's request. The history is
preserved for posterity with the email convention of ">". One big
change from the original patch, aside from a bunch of cropping, is
that I've created an i915_vma_unbind() function. That is because we
always have the VMA in hand anyway, so an extra lookup is unnecessary.
There is one caveat: we retain i915_gem_object_ggtt_unbind() for the
global-GTT cases which might not talk in VMAs.
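
Roughly, the two call styles this leaves us with look like the
following (sketch only; obj, vm and ret are assumed from the calling
context):

        /* VM-aware path: callers that know their address space unbind
         * the VMA directly. */
        ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));

        /* Convenience wrapper kept for global GTT callers that don't
         * talk in VMAs. */
        ret = i915_gem_object_ggtt_unbind(obj);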

> drm/i915: plumb VM into object operations
>
> This patch was formerly known as:
> "drm/i915: Create VMAs (part 3) - plumbing"
>
> This patch adds a VM argument, bind/unbind, and the object
> offset/size/color getters/setters. It preserves the old ggtt helper
> functions because things still need, and will continue to need them.
>
> Some code will still need to be ported over after this.
>
> v2: Fix purge to pick an object and unbind all vmas
> This was doable because of the global bound list change.
>
> v3: With the commit to actually pin/unpin pages in place, there is no
> longer a need to check if unbind succeeded before calling put_pages().
> Make put_pages only BUG() after checking pin count.
>
> v4: Rebased on top of the new hangcheck work by Mika
> plumbed eb_destroy also
> Many checkpatch related fixes
>
> v5: Very large rebase
>
> v6:
> Change BUG_ON to WARN_ON (Daniel)
> Rename vm to ggtt in preallocate stolen, since it is always ggtt when
> dealing with stolen memory. (Daniel)
> list_for_each will short-circuit already (Daniel)
> remove superflous space (Daniel)
> Use per object list of vmas (Daniel)
> Make obj_bound_any() use obj_bound for each vm (Ben)
> s/bind_to_gtt/bind_to_vm/ (Ben)
>
> Fixed up the inactive shrinker. As Daniel noticed the code could
> potentially count the same object multiple times. While it's not
> possible in the current case, since 1 object can only ever be bound into
> 1 address space thus far - we may as well try to get something more
> future proof in place now. With a prep patch before this to switch over
> to using the bound list + inactive check, we're now able to carry that
> forward for every address space an object is bound into.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +-
 drivers/gpu/drm/i915/i915_drv.h            |   3 +-
 drivers/gpu/drm/i915/i915_gem.c            | 134 +++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_evict.c      |   4 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c     |   9 +-
 drivers/gpu/drm/i915/i915_trace.h          |  37 ++++----
 7 files changed, 120 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d6154cb..6d5ca85bd 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1796,7 +1796,7 @@ i915_drop_caches_set(void *data, u64 val)
 					 mm_list) {
 			if (obj->pin_count)
 				continue;
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_gem_object_ggtt_unbind(obj);
 			if (ret)
 				goto unlock;
 		}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index dbfffb2..0610588 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1700,7 +1700,8 @@ int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     bool map_and_fenceable,
 				     bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
-int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
+int __must_check i915_vma_unbind(struct i915_vma *vma);
+int __must_check i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj);
 int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4b669e8..0cb36c2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -38,10 +38,12 @@
 
 static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
-static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
-						    unsigned alignment,
-						    bool map_and_fenceable,
-						    bool nonblocking);
+static __must_check int
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		  bool purgeable_only)
 {
 	struct drm_i915_gem_object *obj, *next;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	long count = 0;
 
 	list_for_each_entry_safe(obj, next,
@@ -1692,13 +1693,16 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		}
 	}
 
-	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
+	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
+				 global_list) {
+		struct i915_vma *vma, *v;
 
 		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
 			continue;
 
-		if (i915_gem_object_unbind(obj))
-			continue;
+		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
+			if (i915_vma_unbind(vma))
+				break;
 
 		if (!i915_gem_object_put_pages(obj)) {
 			count += obj->base.size >> PAGE_SHIFT;
@@ -2591,17 +2595,13 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
 					    old_write_domain);
 }
 
-/**
- * Unbinds an object from the GTT aperture.
- */
-int
-i915_gem_object_unbind(struct drm_i915_gem_object *obj)
+int i915_vma_unbind(struct i915_vma *vma)
 {
+	struct drm_i915_gem_object *obj = vma->obj;
 	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
-	struct i915_vma *vma;
 	int ret;
 
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (list_empty(&vma->vma_link))
 		return 0;
 
 	if (obj->pin_count)
@@ -2624,7 +2624,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	if (ret)
 		return ret;
 
-	trace_i915_gem_object_unbind(obj);
+	trace_i915_vma_unbind(vma);
 
 	if (obj->has_global_gtt_mapping)
 		i915_gem_gtt_unbind_object(obj);
@@ -2639,7 +2639,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
-	vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
 	i915_gem_vma_destroy(vma);
 
 	/* Since the unbound list is global, only move to that list if
@@ -2652,6 +2651,26 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	return 0;
 }
 
+/**
+ * Unbinds an object from the global GTT aperture.
+ */
+int
+i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	struct i915_address_space *ggtt = &dev_priv->gtt.base;
+
+	if (!i915_gem_obj_ggtt_bound(obj))
+		return 0;
+
+	if (obj->pin_count)
+		return -EBUSY;
+
+	BUG_ON(obj->pages == NULL);
+
+	return i915_vma_unbind(i915_gem_obj_to_vma(obj, ggtt));
+}
+
 int i915_gpu_idle(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -3069,18 +3088,18 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
  * Finds free space in the GTT aperture and binds the object there.
  */
 static int
-i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
-			    unsigned alignment,
-			    bool map_and_fenceable,
-			    bool nonblocking)
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking)
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
-	size_t gtt_max = map_and_fenceable ?
-		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
+	size_t gtt_max =
+		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3125,15 +3144,18 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
+	/* FIXME: For now we only ever use 1 VMA per object */
+	BUG_ON(!i915_is_ggtt(vm));
+	WARN_ON(!list_empty(&obj->vma_list));
+
+	vma = i915_gem_vma_create(obj, vm);
 	if (IS_ERR(vma)) {
 		i915_gem_object_unpin_pages(obj);
 		return PTR_ERR(vma);
 	}
 
 search_free:
-	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
-						  &vma->node,
+	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max);
 	if (ret) {
@@ -3158,18 +3180,25 @@ search_free:
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &vm->inactive_list);
-	list_add(&vma->vma_link, &obj->vma_list);
+
+	/* Keep GGTT vmas first to make debug easier */
+	if (i915_is_ggtt(vm))
+		list_add(&vma->vma_link, &obj->vma_list);
+	else
+		list_add_tail(&vma->vma_link, &obj->vma_list);
 
 	fenceable =
+		i915_is_ggtt(vm) &&
 		i915_gem_obj_ggtt_size(obj) == fence_size &&
 		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
 
-	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
-		dev_priv->gtt.mappable_end;
+	mappable =
+		i915_is_ggtt(vm) &&
+		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
 	obj->map_and_fenceable = mappable && fenceable;
 
-	trace_i915_gem_object_bind(obj, map_and_fenceable);
+	trace_i915_vma_bind(vma, map_and_fenceable);
 	i915_gem_verify_gtt(dev);
 	return 0;
 
@@ -3335,7 +3364,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 
 	list_for_each_entry(vma, &obj->vma_list, vma_link) {
 		if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_vma_unbind(vma);
 			if (ret)
 				return ret;
 
@@ -3643,33 +3672,39 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	struct i915_vma *vma;
 	int ret;
 
 	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
 		return -EBUSY;
 
-	if (i915_gem_obj_ggtt_bound(obj)) {
-		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
+	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
+
+	vma = i915_gem_obj_to_vma(obj, vm);
+
+	if (vma) {
+		if ((alignment &&
+		     vma->node.start & (alignment - 1)) ||
 		    (map_and_fenceable && !obj->map_and_fenceable)) {
 			WARN(obj->pin_count,
 			     "bo is already pinned with incorrect alignment:"
 			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
 			     " obj->map_and_fenceable=%d\n",
-			     i915_gem_obj_ggtt_offset(obj), alignment,
+			     i915_gem_obj_offset(obj, vm), alignment,
 			     map_and_fenceable,
 			     obj->map_and_fenceable);
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_vma_unbind(vma);
 			if (ret)
 				return ret;
 		}
 	}
 
-	if (!i915_gem_obj_ggtt_bound(obj)) {
+	if (!i915_gem_obj_bound(obj, vm)) {
 		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 
-		ret = i915_gem_object_bind_to_gtt(obj, alignment,
-						  map_and_fenceable,
-						  nonblocking);
+		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
+						 map_and_fenceable,
+						 nonblocking);
 		if (ret)
 			return ret;
 
@@ -3961,6 +3996,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_vma *vma, *next;
 
 	trace_i915_gem_object_destroy(obj);
 
@@ -3968,15 +4004,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		i915_gem_detach_phys_object(dev, obj);
 
 	obj->pin_count = 0;
-	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
-		bool was_interruptible;
+	/* NB: 0 or 1 elements */
+	WARN_ON(!list_empty(&obj->vma_list) &&
+		!list_is_singular(&obj->vma_list));
+	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
+		int ret = i915_vma_unbind(vma);
+		if (WARN_ON(ret == -ERESTARTSYS)) {
+			bool was_interruptible;
 
-		was_interruptible = dev_priv->mm.interruptible;
-		dev_priv->mm.interruptible = false;
+			was_interruptible = dev_priv->mm.interruptible;
+			dev_priv->mm.interruptible = false;
 
-		WARN_ON(i915_gem_object_unbind(obj));
+			WARN_ON(i915_vma_unbind(vma));
 
-		dev_priv->mm.interruptible = was_interruptible;
+			dev_priv->mm.interruptible = was_interruptible;
+		}
 	}
 
 	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 33d85a4..9205a41 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -147,7 +147,7 @@ found:
 				       struct drm_i915_gem_object,
 				       exec_list);
 		if (ret == 0)
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_gem_object_ggtt_unbind(obj);
 
 		list_del_init(&obj->exec_list);
 		drm_gem_object_unreference(&obj->base);
@@ -185,7 +185,7 @@ i915_gem_evict_everything(struct drm_device *dev)
 	/* Having flushed everything, unbind() should never raise an error */
 	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
 		if (obj->pin_count == 0)
-			WARN_ON(i915_gem_object_unbind(obj));
+			WARN_ON(i915_gem_object_ggtt_unbind(obj));
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a23b80f..5e68f1e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -556,7 +556,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			if ((entry->alignment &&
 			     obj_offset & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
-				ret = i915_gem_object_unbind(obj);
+				ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));
 			else
 				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 92a8d27..032e9ef 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -360,17 +360,18 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 
 		obj->map_and_fenceable =
 			!i915_gem_obj_ggtt_bound(obj) ||
-			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
+			(i915_gem_obj_ggtt_offset(obj) +
+			 obj->base.size <= dev_priv->gtt.mappable_end &&
 			 i915_gem_object_fence_ok(obj, args->tiling_mode));
 
 		/* Rebind if we need a change of alignment */
 		if (!obj->map_and_fenceable) {
-			u32 unfenced_alignment =
+			u32 unfenced_align =
 				i915_gem_get_gtt_alignment(dev, obj->base.size,
 							    args->tiling_mode,
 							    false);
-			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
-				ret = i915_gem_object_unbind(obj);
+			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
+				ret = i915_gem_object_ggtt_unbind(obj);
 		}
 
 		if (ret == 0) {
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 7d283b5..931e2c6 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -33,47 +33,52 @@ TRACE_EVENT(i915_gem_object_create,
 	    TP_printk("obj=%p, size=%u", __entry->obj, __entry->size)
 );
 
-TRACE_EVENT(i915_gem_object_bind,
-	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
-	    TP_ARGS(obj, mappable),
+TRACE_EVENT(i915_vma_bind,
+	    TP_PROTO(struct i915_vma *vma, bool mappable),
+	    TP_ARGS(vma, mappable),
 
 	    TP_STRUCT__entry(
 			     __field(struct drm_i915_gem_object *, obj)
+			     __field(struct i915_address_space *, vm)
 			     __field(u32, offset)
 			     __field(u32, size)
 			     __field(bool, mappable)
 			     ),
 
 	    TP_fast_assign(
-			   __entry->obj = obj;
-			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
-			   __entry->size = i915_gem_obj_ggtt_size(obj);
+			   __entry->obj = vma->obj;
+			   __entry->vm = vma->vm;
+			   __entry->offset = vma->node.start;
+			   __entry->size = vma->node.size;
 			   __entry->mappable = mappable;
 			   ),
 
-	    TP_printk("obj=%p, offset=%08x size=%x%s",
+	    TP_printk("obj=%p, offset=%08x size=%x%s vm=%p",
 		      __entry->obj, __entry->offset, __entry->size,
-		      __entry->mappable ? ", mappable" : "")
+		      __entry->mappable ? ", mappable" : "",
+		      __entry->vm)
 );
 
-TRACE_EVENT(i915_gem_object_unbind,
-	    TP_PROTO(struct drm_i915_gem_object *obj),
-	    TP_ARGS(obj),
+TRACE_EVENT(i915_vma_unbind,
+	    TP_PROTO(struct i915_vma *vma),
+	    TP_ARGS(vma),
 
 	    TP_STRUCT__entry(
 			     __field(struct drm_i915_gem_object *, obj)
+			     __field(struct i915_address_space *, vm)
 			     __field(u32, offset)
 			     __field(u32, size)
 			     ),
 
 	    TP_fast_assign(
-			   __entry->obj = obj;
-			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
-			   __entry->size = i915_gem_obj_ggtt_size(obj);
+			   __entry->obj = vma->obj;
+			   __entry->vm = vma->vm;
+			   __entry->offset = vma->node.start;
+			   __entry->size = vma->node.size;
 			   ),
 
-	    TP_printk("obj=%p, offset=%08x size=%x",
-		      __entry->obj, __entry->offset, __entry->size)
+	    TP_printk("obj=%p, offset=%08x size=%x vm=%p",
+		      __entry->obj, __entry->offset, __entry->size, __entry->vm)
 );
 
 TRACE_EVENT(i915_gem_object_change_domain,
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (16 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 17/29] drm/i915: plumb VM into bind/unbind code Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-06 18:39   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any Ben Widawsky
                   ` (10 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Eviction code, like the rest of the converted code, needs to be aware
of the address space it is evicting from (or, in the evict-everything
case, all address spaces). With the updated bind/unbind interfaces of
the last patch, we can now safely move the eviction code over.
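
Concretely, the bind path just forwards its target VM to the evictor;
the call site ends up as shown in the i915_gem.c hunk below (variables
as in that hunk):

        ret = i915_gem_evict_something(dev, vm, size, alignment,
                                       obj->cache_level,
                                       map_and_fenceable, nonblocking);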

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h       |  4 ++-
 drivers/gpu/drm/i915/i915_gem.c       |  2 +-
 drivers/gpu/drm/i915/i915_gem_evict.c | 53 +++++++++++++++++++----------------
 3 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0610588..bf1ecef 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1946,7 +1946,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
 
 
 /* i915_gem_evict.c */
-int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
+int __must_check i915_gem_evict_something(struct drm_device *dev,
+					  struct i915_address_space *vm,
+					  int min_size,
 					  unsigned alignment,
 					  unsigned cache_level,
 					  bool mappable,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0cb36c2..1013105 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3159,7 +3159,7 @@ search_free:
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max);
 	if (ret) {
-		ret = i915_gem_evict_something(dev, size, alignment,
+		ret = i915_gem_evict_something(dev, vm, size, alignment,
 					       obj->cache_level,
 					       map_and_fenceable,
 					       nonblocking);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 9205a41..61bf5e2 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -32,26 +32,21 @@
 #include "i915_trace.h"
 
 static bool
-mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
+mark_free(struct i915_vma *vma, struct list_head *unwind)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
-
-	if (obj->pin_count)
+	if (vma->obj->pin_count)
 		return false;
 
-	list_add(&obj->exec_list, unwind);
+	list_add(&vma->obj->exec_list, unwind);
 	return drm_mm_scan_add_block(&vma->node);
 }
 
 int
-i915_gem_evict_something(struct drm_device *dev, int min_size,
-			 unsigned alignment, unsigned cache_level,
+i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
+			 int min_size, unsigned alignment, unsigned cache_level,
 			 bool mappable, bool nonblocking)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct list_head eviction_list, unwind_list;
 	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
@@ -83,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 	 */
 
 	INIT_LIST_HEAD(&unwind_list);
-	if (mappable)
+	if (mappable) {
+		BUG_ON(!i915_is_ggtt(vm));
 		drm_mm_init_scan_with_range(&vm->mm, min_size,
 					    alignment, cache_level, 0,
 					    dev_priv->gtt.mappable_end);
-	else
+	} else
 		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
 	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
-		if (mark_free(obj, &unwind_list))
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
 
@@ -101,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 
 	/* Now merge in the soon-to-be-expired objects... */
 	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		if (mark_free(obj, &unwind_list))
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
 
@@ -111,7 +109,7 @@ none:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
+		vma = i915_gem_obj_to_vma(obj, vm);
 		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
@@ -132,7 +130,7 @@ found:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
+		vma = i915_gem_obj_to_vma(obj, vm);
 		if (drm_mm_scan_remove_block(&vma->node)) {
 			list_move(&obj->exec_list, &eviction_list);
 			drm_gem_object_reference(&obj->base);
@@ -147,7 +145,7 @@ found:
 				       struct drm_i915_gem_object,
 				       exec_list);
 		if (ret == 0)
-			ret = i915_gem_object_ggtt_unbind(obj);
+			ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));
 
 		list_del_init(&obj->exec_list);
 		drm_gem_object_unreference(&obj->base);
@@ -160,13 +158,18 @@ int
 i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj, *next;
-	bool lists_empty;
+	bool lists_empty = true;
 	int ret;
 
-	lists_empty = (list_empty(&vm->inactive_list) &&
-		       list_empty(&vm->active_list));
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		lists_empty = (list_empty(&vm->inactive_list) &&
+			       list_empty(&vm->active_list));
+		if (!lists_empty)
+			break;
+	}
+
 	if (lists_empty)
 		return -ENOSPC;
 
@@ -183,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
 	i915_gem_retire_requests(dev);
 
 	/* Having flushed everything, unbind() should never raise an error */
-	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
-		if (obj->pin_count == 0)
-			WARN_ON(i915_gem_object_ggtt_unbind(obj));
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
+			if (obj->pin_count == 0)
+				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
+	}
 
 	return 0;
 }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (17 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-03 11:03   ` Chris Wilson
  2013-08-06 18:43   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 20/29] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
                   ` (9 subsequent siblings)
  28 siblings, 2 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

In some places, we want to know if an object is bound in any address
space, and not just the global GTT. This often applies when there is a
single global resource (object, pages, etc.)

function                             |      reason
--------------------------------------------------
i915_gem_object_is_inactive          | global object
i915_gem_object_put_pages            | object's pages
i915_gem_object_unpin                | global object
i915_gem_execbuffer_unreserve_object | temporary until we plumb vma
pread/pwrite                         | object's domain
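
For clarity, "bound in any address space" simply means the object has
at least one VMA with an allocated node. A minimal sketch of the idea
(sketch_obj_bound_any is an illustrative name; the real helper was
introduced earlier in the series):

        static bool sketch_obj_bound_any(struct drm_i915_gem_object *obj)
        {
                struct i915_vma *vma;

                list_for_each_entry(vma, &obj->vma_list, vma_link)
                        if (drm_mm_node_allocated(&vma->node))
                                return true;

                return false;
        }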

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c            | 12 ++++++------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1013105..d4d6444 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -122,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
+	return i915_gem_obj_bound_any(obj) && !obj->active;
 }
 
 int
@@ -408,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		 * anyway again before the next pread happens. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush = 1;
-		if (i915_gem_obj_ggtt_bound(obj)) {
+		if (i915_gem_obj_bound_any(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, false);
 			if (ret)
 				return ret;
@@ -725,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		 * right away and we therefore have to clflush anyway. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush_after = 1;
-		if (i915_gem_obj_ggtt_bound(obj)) {
+		if (i915_gem_obj_bound_any(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, true);
 			if (ret)
 				return ret;
@@ -1659,7 +1659,7 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	if (obj->pages_pin_count)
 		return -EBUSY;
 
-	BUG_ON(i915_gem_obj_ggtt_bound(obj));
+	BUG_ON(i915_gem_obj_bound_any(obj));
 
 	/* ->put_pages might need to allocate memory for the bit17 swizzle
 	 * array, hence protect them from being reaped by removing them from gtt
@@ -3301,7 +3301,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	int ret;
 
 	/* Not valid to be called on unbound objects. */
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (!i915_gem_obj_bound_any(obj))
 		return -EINVAL;
 
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
@@ -3725,7 +3725,7 @@ void
 i915_gem_object_unpin(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pin_count == 0);
-	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
+	BUG_ON(!i915_gem_obj_bound_any(obj));
 
 	if (--obj->pin_count == 0)
 		obj->pin_mappable = false;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 5e68f1e..64dc6b5 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -466,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_gem_exec_object2 *entry;
 
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (!i915_gem_obj_bound_any(obj))
 		return;
 
 	entry = obj->exec_entry;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 20/29] drm/i915: Fix up map and fenceable for VMA
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (18 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-06 19:11   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 21/29] drm/i915: mm_list is per VMA Ben Widawsky
                   ` (8 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
tracking"

The map_and_fenceable tracking is per object. GTT mapping and fences
only apply to the global GTT. As such, object operations which are not
performed on the global GTT should not affect the mappable or
fenceable characteristics.

Functionally, this commit could very well be squashed into a previous
patch which updated object operations to take a VM argument.  This
commit is split out because it's a bit tricky (or at least it was for
me).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d4d6444..ec23a5c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2626,7 +2626,7 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping)
+	if (obj->has_global_gtt_mapping && i915_is_ggtt(vma->vm))
 		i915_gem_gtt_unbind_object(obj);
 	if (obj->has_aliasing_ppgtt_mapping) {
 		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
@@ -2637,7 +2637,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	list_del(&obj->mm_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
-	obj->map_and_fenceable = true;
+	if (i915_is_ggtt(vma->vm))
+		obj->map_and_fenceable = true;
 
 	i915_gem_vma_destroy(vma);
 
@@ -3196,7 +3197,9 @@ search_free:
 		i915_is_ggtt(vm) &&
 		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
-	obj->map_and_fenceable = mappable && fenceable;
+	/* Map and fenceable only changes if the VM is the global GGTT */
+	if (i915_is_ggtt(vm))
+		obj->map_and_fenceable = mappable && fenceable;
 
 	trace_i915_vma_bind(vma, map_and_fenceable);
 	i915_gem_verify_gtt(dev);
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 21/29] drm/i915: mm_list is per VMA
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (19 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 20/29] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-06 19:38   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 22/29] drm/i915: Update error capture for VMs Ben Widawsky
                   ` (7 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

formerly: "drm/i915: Create VMAs (part 5) - move mm_list"

The mm_list is used for the active/inactive LRUs. Since those LRUs are
per address space, the link should be per VMA.

Because we'll only ever have 1 VMA before this point, it's not incorrect
to defer this change until this point in the patch series, and doing it
here makes the change much easier to understand.

Shamelessly manipulated out of Daniel:
"active/inactive stuff is used by eviction when we run out of address
space, so needs to be per-vma and per-address space. Bound/unbound otoh
is used by the shrinker which only cares about the amount of memory used
and not one bit about in which address space this memory is all used in.
Of course to actually kick out an object we need to unbind it from every
address space, but for that we have the per-object list of vmas."

v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)

v3: Moved earlier in the series

v4: Add dropped message from v3
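
The net effect on LRU handling: the list link lives on the VMA, so
bumping an object on a particular VM's LRU becomes the following
(sketch mirroring the set_to_gtt_domain hunk below; obj and vm assumed
from the caller):

        struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);

        if (vma)
                list_move_tail(&vma->mm_list, &vm->inactive_list);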

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 53 ++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_drv.h            |  5 +--
 drivers/gpu/drm/i915/i915_gem.c            | 37 +++++++++++----------
 drivers/gpu/drm/i915/i915_gem_context.c    |  3 ++
 drivers/gpu/drm/i915/i915_gem_evict.c      | 14 ++++----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 ++
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      | 37 ++++++++++++---------
 8 files changed, 91 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6d5ca85bd..181e5a6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -149,7 +149,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	size_t total_obj_size, total_gtt_size;
 	int count, ret;
 
@@ -157,6 +157,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	if (ret)
 		return ret;
 
+	/* FIXME: the user of this interface might want more than just GGTT */
 	switch (list) {
 	case ACTIVE_LIST:
 		seq_puts(m, "Active:\n");
@@ -172,12 +173,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	}
 
 	total_obj_size = total_gtt_size = count = 0;
-	list_for_each_entry(obj, head, mm_list) {
-		seq_puts(m, "   ");
-		describe_obj(m, obj);
-		seq_putc(m, '\n');
-		total_obj_size += obj->base.size;
-		total_gtt_size += i915_gem_obj_ggtt_size(obj);
+	list_for_each_entry(vma, head, mm_list) {
+		seq_printf(m, "   ");
+		describe_obj(m, vma->obj);
+		seq_printf(m, "\n");
+		total_obj_size += vma->obj->base.size;
+		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
 		count++;
 	}
 	mutex_unlock(&dev->struct_mutex);
@@ -224,7 +225,18 @@ static int per_file_stats(int id, void *ptr, void *data)
 	return 0;
 }
 
-static int i915_gem_object_info(struct seq_file *m, void *data)
+#define count_vmas(list, member) do { \
+	list_for_each_entry(vma, list, member) { \
+		size += i915_gem_obj_ggtt_size(vma->obj); \
+		++count; \
+		if (vma->obj->map_and_fenceable) { \
+			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
+			++mappable_count; \
+		} \
+	} \
+} while (0)
+
+static int i915_gem_object_info(struct seq_file *m, void* data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
@@ -234,6 +246,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_file *file;
+	struct i915_vma *vma;
 	int ret;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -253,12 +266,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&vm->active_list, mm_list);
+	count_vmas(&vm->active_list, mm_list);
 	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&vm->inactive_list, mm_list);
+	count_vmas(&vm->inactive_list, mm_list);
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -1771,7 +1784,8 @@ i915_drop_caches_set(void *data, u64 val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj, *next;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
+	struct i915_vma *vma, *x;
 	int ret;
 
 	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
@@ -1792,13 +1806,16 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_retire_requests(dev);
 
 	if (val & DROP_BOUND) {
-		list_for_each_entry_safe(obj, next, &vm->inactive_list,
-					 mm_list) {
-			if (obj->pin_count)
-				continue;
-			ret = i915_gem_object_ggtt_unbind(obj);
-			if (ret)
-				goto unlock;
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+			list_for_each_entry_safe(vma, x, &vm->inactive_list,
+						 mm_list) {
+				if (vma->obj->pin_count)
+					continue;
+
+				ret = i915_vma_unbind(vma);
+				if (ret)
+					goto unlock;
+			}
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bf1ecef..220699b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -546,6 +546,9 @@ struct i915_vma {
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm;
 
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
 	struct list_head vma_link; /* Link in the object's VMA list */
 };
 
@@ -1263,9 +1266,7 @@ struct drm_i915_gem_object {
 	struct drm_mm_node *stolen;
 	struct list_head global_list;
 
-	/** This object's place on the active/inactive lists */
 	struct list_head ring_list;
-	struct list_head mm_list;
 	/** This object's place in the batchbuffer or on the eviction list */
 	struct list_head exec_list;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ec23a5c..fb3f02f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1872,7 +1872,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 seqno = intel_ring_get_seqno(ring);
 
 	BUG_ON(ring == NULL);
@@ -1888,8 +1887,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 		obj->active = 1;
 	}
 
-	/* Move from whatever list we were on to the tail of execution. */
-	list_move_tail(&obj->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
 	obj->last_read_seqno = seqno;
@@ -1911,14 +1908,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
+	struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_move_tail(&obj->mm_list, &vm->inactive_list);
+	list_move_tail(&vma->mm_list, &ggtt_vm->inactive_list);
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -2286,9 +2283,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm;
-	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	int i;
 
 	for_each_ring(ring, dev_priv, i)
@@ -2298,8 +2295,8 @@ void i915_gem_reset(struct drm_device *dev)
 	 * necessary invalidation upon reuse.
 	 */
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-		list_for_each_entry(obj, &vm->inactive_list, mm_list)
-			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
+		list_for_each_entry(vma, &vm->inactive_list, mm_list)
+			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 
 	i915_gem_restore_fences(dev);
 }
@@ -2353,6 +2350,7 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
 			break;
 
+		BUG_ON(!obj->active);
 		i915_gem_object_move_to_inactive(obj);
 	}
 
@@ -2635,7 +2633,6 @@ int i915_vma_unbind(struct i915_vma *vma)
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
-	list_del(&obj->mm_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
 	if (i915_is_ggtt(vma->vm))
 		obj->map_and_fenceable = true;
@@ -3180,7 +3177,7 @@ search_free:
 		goto err_out;
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	list_add_tail(&vma->mm_list, &vm->inactive_list);
 
 	/* Keep GGTT vmas first to make debug easier */
 	if (i915_is_ggtt(vm))
@@ -3342,9 +3339,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 					    old_write_domain);
 
 	/* And bump the LRU for this access */
-	if (i915_gem_object_is_inactive(obj))
-		list_move_tail(&obj->mm_list,
-			       &dev_priv->gtt.base.inactive_list);
+	if (i915_gem_object_is_inactive(obj)) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
+		if (vma)
+			list_move_tail(&vma->mm_list,
+				       &dev_priv->gtt.base.inactive_list);
+
+	}
 
 	return 0;
 }
@@ -3917,7 +3919,6 @@ unlock:
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops)
 {
-	INIT_LIST_HEAD(&obj->mm_list);
 	INIT_LIST_HEAD(&obj->global_list);
 	INIT_LIST_HEAD(&obj->ring_list);
 	INIT_LIST_HEAD(&obj->exec_list);
@@ -4054,6 +4055,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&vma->vma_link);
+	INIT_LIST_HEAD(&vma->mm_list);
 	vma->vm = vm;
 	vma->obj = obj;
 
@@ -4063,6 +4065,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 void i915_gem_vma_destroy(struct i915_vma *vma)
 {
 	list_del_init(&vma->vma_link);
+	list_del(&vma->mm_list);
 	drm_mm_remove_node(&vma->node);
 	kfree(vma);
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index d1cb28c..88b0f52 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -436,7 +436,10 @@ static int do_switch(struct i915_hw_context *to)
 	 * MI_SET_CONTEXT instead of when the next seqno has completed.
 	 */
 	if (from != NULL) {
+		struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
+		struct i915_address_space *ggtt = &dev_priv->gtt.base;
 		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
+		list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
 		i915_gem_object_move_to_active(from->obj, ring);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 61bf5e2..425939b 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
-	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
 		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
@@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 		goto none;
 
 	/* Now merge in the soon-to-be-expired objects... */
-	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+	list_for_each_entry(vma, &vm->active_list, mm_list) {
 		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
@@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_address_space *vm;
-	struct drm_i915_gem_object *obj, *next;
+	struct i915_vma *vma, *next;
 	bool lists_empty = true;
 	int ret;
 
@@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
 
 	/* Having flushed everything, unbind() should never raise an error */
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
-		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
-			if (obj->pin_count == 0)
-				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
+		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
+			if (vma->obj->pin_count == 0)
+				WARN_ON(i915_vma_unbind(vma));
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 64dc6b5..0f21702 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -801,6 +801,8 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
+		/* FIXME: This lookup gets fixed later <-- danvet */
+		list_move_tail(&i915_gem_obj_to_vma(obj, vm)->mm_list, &vm->active_list);
 		i915_gem_object_move_to_active(obj, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 000ffbd..fa60103 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
+	list_add_tail(&vma->mm_list, &ggtt->inactive_list);
 
 	return obj;
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index d970d84..9623a4e 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -556,11 +556,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 static u32 capture_active_bo(struct drm_i915_error_buffer *err,
 			     int count, struct list_head *head)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	int i = 0;
 
-	list_for_each_entry(obj, head, mm_list) {
-		capture_bo(err++, obj);
+	list_for_each_entry(vma, head, mm_list) {
+		capture_bo(err++, vma->obj);
 		if (++i == count)
 			break;
 	}
@@ -622,7 +622,8 @@ static struct drm_i915_error_object *
 i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			     struct intel_ring_buffer *ring)
 {
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	u32 seqno;
 
@@ -642,20 +643,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	}
 
 	seqno = ring->get_seqno(ring, false);
-	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		if (obj->ring != ring)
-			continue;
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_for_each_entry(vma, &vm->active_list, mm_list) {
+			obj = vma->obj;
+			if (obj->ring != ring)
+				continue;
 
-		if (i915_seqno_passed(seqno, obj->last_read_seqno))
-			continue;
+			if (i915_seqno_passed(seqno, obj->last_read_seqno))
+				continue;
 
-		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
-			continue;
+			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
+				continue;
 
-		/* We need to copy these to an anonymous buffer as the simplest
-		 * method to avoid being overwritten by userspace.
-		 */
-		return i915_error_object_create(dev_priv, obj);
+			/* We need to copy these to an anonymous buffer as the simplest
+			 * method to avoid being overwritten by userspace.
+			 */
+			return i915_error_object_create(dev_priv, obj);
+		}
 	}
 
 	return NULL;
@@ -775,11 +779,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 				     struct drm_i915_error_state *error)
 {
 	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	int i;
 
 	i = 0;
-	list_for_each_entry(obj, &vm->active_list, mm_list)
+	list_for_each_entry(vma, &vm->active_list, mm_list)
 		i++;
 	error->active_bo_count = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 22/29] drm/i915: Update error capture for VMs
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (20 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 21/29] drm/i915: mm_list is per VMA Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-01  0:00 ` [PATCH 23/29] drm/i915: Add vma to list at creation Ben Widawsky
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

formerly: "drm/i915: Create VMAs (part 4) - Error capture"

Since the active/inactive lists are per VM, we need to modify the error
capture code to be aware of this, and also extend it to capture the
buffers from all the VMs. For now all the code assumes only 1 VM, but it
will become more generic over the next few patches.

NOTE: If the number of VMs in a real world system grows significantly
we'll have to focus on only capturing the guilty VM, or else it's likely
there won't be enough space for error capture.
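
A rough sketch of where the capture path ends up (condensed from the
i915_gem_capture_buffers() hunk below; the pinned_bo arrays and allocation
failure handling are omitted here):

	struct i915_address_space *vm;
	int cnt = 0, i = 0;

	/* one slot of the error arrays per address space */
	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
		cnt++;

	error->active_bo = kcalloc(cnt, sizeof(*error->active_bo), GFP_ATOMIC);
	error->active_bo_count = kcalloc(cnt, sizeof(*error->active_bo_count),
					 GFP_ATOMIC);

	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
		i915_gem_capture_vm(dev_priv, error, vm, i++);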

v2: Squashed in the "part 6" which had dependencies on the mm_list
change. Since I've moved the mm_list change to an earlier point in the
series, we were able to accomplish it here and now.

v3: Rebased over new error capture

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h       |  4 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 76 ++++++++++++++++++++++++-----------
 2 files changed, 55 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 220699b..f6c2812 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -323,8 +323,8 @@ struct drm_i915_error_state {
 		u32 purgeable:1;
 		s32 ring:4;
 		u32 cache_level:2;
-	} *active_bo, *pinned_bo;
-	u32 active_bo_count, pinned_bo_count;
+	} **active_bo, **pinned_bo;
+	u32 *active_bo_count, *pinned_bo_count;
 	struct intel_overlay_error_state *overlay;
 	struct intel_display_error_state *display;
 };
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 9623a4e..b834f78 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -304,13 +304,13 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 
 	if (error->active_bo)
 		print_error_buffers(m, "Active",
-				    error->active_bo,
-				    error->active_bo_count);
+				    error->active_bo[0],
+				    error->active_bo_count[0]);
 
 	if (error->pinned_bo)
 		print_error_buffers(m, "Pinned",
-				    error->pinned_bo,
-				    error->pinned_bo_count);
+				    error->pinned_bo[0],
+				    error->pinned_bo_count[0]);
 
 	for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
 		struct drm_i915_error_object *obj;
@@ -775,42 +775,72 @@ static void i915_gem_record_rings(struct drm_device *dev,
 	}
 }
 
-static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
-				     struct drm_i915_error_state *error)
+/* FIXME: Since pin count/bound list is global, we duplicate what we capture per
+ * VM.
+ */
+static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
+				struct drm_i915_error_state *error,
+				struct i915_address_space *vm,
+				const int ndx)
 {
-	struct i915_address_space *vm = &dev_priv->gtt.base;
-	struct i915_vma *vma;
+	struct drm_i915_error_buffer *active_bo = NULL, *pinned_bo = NULL;
 	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	int i;
 
 	i = 0;
 	list_for_each_entry(vma, &vm->active_list, mm_list)
 		i++;
-	error->active_bo_count = i;
+	error->active_bo_count[ndx] = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
 		if (obj->pin_count)
 			i++;
-	error->pinned_bo_count = i - error->active_bo_count;
+	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
 
 	if (i) {
-		error->active_bo = kmalloc(sizeof(*error->active_bo)*i,
-					   GFP_ATOMIC);
-		if (error->active_bo)
-			error->pinned_bo =
-				error->active_bo + error->active_bo_count;
+		active_bo = kmalloc(sizeof(*active_bo)*i, GFP_ATOMIC);
+		if (active_bo)
+			pinned_bo = active_bo + error->active_bo_count[ndx];
 	}
 
-	if (error->active_bo)
-		error->active_bo_count =
-			capture_active_bo(error->active_bo,
-					  error->active_bo_count,
+	if (active_bo)
+		error->active_bo_count[ndx] =
+			capture_active_bo(active_bo,
+					  error->active_bo_count[ndx],
 					  &vm->active_list);
 
-	if (error->pinned_bo)
-		error->pinned_bo_count =
-			capture_pinned_bo(error->pinned_bo,
-					  error->pinned_bo_count,
+	if (pinned_bo)
+		error->pinned_bo_count[ndx] =
+			capture_pinned_bo(pinned_bo,
+					  error->pinned_bo_count[ndx],
 					  &dev_priv->mm.bound_list);
+	error->active_bo[ndx] = active_bo;
+	error->pinned_bo[ndx] = pinned_bo;
+}
+
+static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
+				     struct drm_i915_error_state *error)
+{
+	struct i915_address_space *vm;
+	int cnt = 0, i = 0;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		cnt++;
+
+	if (WARN(cnt > 1, "Multiple VMs not yet supported\n"))
+		cnt = 1;
+
+	vm = &dev_priv->gtt.base;
+
+	error->active_bo = kcalloc(cnt, sizeof(*error->active_bo), GFP_ATOMIC);
+	error->pinned_bo = kcalloc(cnt, sizeof(*error->pinned_bo), GFP_ATOMIC);
+	error->active_bo_count = kcalloc(cnt, sizeof(*error->active_bo_count),
+					 GFP_ATOMIC);
+	error->pinned_bo_count = kcalloc(cnt, sizeof(*error->pinned_bo_count),
+					 GFP_ATOMIC);
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		i915_gem_capture_vm(dev_priv, error, vm, i++);
 }
 
 /**
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 23/29] drm/i915: Add vma to list at creation
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (21 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 22/29] drm/i915: Update error capture for VMs Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-01  0:00 ` [PATCH 24/29] drm/i915: create vmas at execbuf Ben Widawsky
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With the current code there shouldn't be a functional difference - the vma
is simply added to the object's vma_list at creation rather than at bind
time. However, with an upcoming change we intend to allocate a vma much
earlier, before it's actually bound anywhere.

To cope with that, the _bound() check must also verify that the vma's
drm_mm node has actually been allocated.
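
A minimal sketch of the check as it ends up after this patch (this matches
the i915_gem_obj_bound() hunk below):

	bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
				struct i915_address_space *vm)
	{
		struct i915_vma *vma;

		list_for_each_entry(vma, &o->vma_list, vma_link)
			/* a vma can now be on the list before it owns a node */
			if (vma->vm == vm && drm_mm_node_allocated(&vma->node))
				return true;

		return false;
	}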

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fb3f02f..21331d8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3179,12 +3179,6 @@ search_free:
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&vma->mm_list, &vm->inactive_list);
 
-	/* Keep GGTT vmas first to make debug easier */
-	if (i915_is_ggtt(vm))
-		list_add(&vma->vma_link, &obj->vma_list);
-	else
-		list_add_tail(&vma->vma_link, &obj->vma_list);
-
 	fenceable =
 		i915_is_ggtt(vm) &&
 		i915_gem_obj_ggtt_size(obj) == fence_size &&
@@ -4059,6 +4053,12 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 	vma->vm = vm;
 	vma->obj = obj;
 
+	/* Keep GGTT vmas first to make debug easier */
+	if (i915_is_ggtt(vm))
+		list_add(&vma->vma_link, &obj->vma_list);
+	else
+		list_add_tail(&vma->vma_link, &obj->vma_list);
+
 	return vma;
 }
 
@@ -4754,7 +4754,7 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
 	struct i915_vma *vma;
 
 	list_for_each_entry(vma, &o->vma_list, vma_link)
-		if (vma->vm == vm)
+		if (vma->vm == vm && drm_mm_node_allocated(&vma->node))
 			return true;
 
 	return false;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 24/29] drm/i915: create vmas at execbuf
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (22 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 23/29] drm/i915: Add vma to list at creation Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-07 20:52   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 25/29] drm/i915: Convert execbuf code to use vmas Ben Widawsky
                   ` (4 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

In order to transition more of our code over to using a VMA instead of
an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up
until now, we've only had a VMA when actually binding an object.

The previous patch helped handle the distinction on bound vs. unbound.
This patch will help us catch leaks and other issues before we actually
shuffle a bunch of stuff around.

The subsequent patch to fix up the rest of execbuf should be mostly just
moving code around, and this is the major functional change.
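
The heart of it is a small lookup-or-create helper; most of the rest is
plumbing the address space down to it. A sketch matching the hunk below
(callers are responsible for checking the IS_ERR() result):

	struct i915_vma *
	i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
					  struct i915_address_space *vm)
	{
		struct i915_vma *vma;

		/* reuse an existing vma for this <obj, vm> pair if there is one */
		vma = i915_gem_obj_to_vma(obj, vm);
		if (!vma)
			/* otherwise create it now, possibly long before binding */
			vma = i915_gem_vma_create(obj, vm);

		return vma;
	}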

v2: Release table_lock earlier so vma allocation needn't be atomic.
(Chris)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  3 +++
 drivers/gpu/drm/i915/i915_gem.c            | 25 ++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 18 +++++++++++++-----
 3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f6c2812..c0eb7fd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1857,6 +1857,9 @@ unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
 				struct i915_address_space *vm);
 struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm);
+struct i915_vma *
+i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
+				  struct i915_address_space *vm);
 /* Some GGTT VM helpers */
 #define obj_to_ggtt(obj) \
 	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 21331d8..72bd53c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3101,8 +3101,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	struct i915_vma *vma;
 	int ret;
 
-	if (WARN_ON(!list_empty(&obj->vma_list)))
-		return -EBUSY;
+	BUG_ON(!i915_is_ggtt(vm));
 
 	fence_size = i915_gem_get_gtt_size(dev,
 					   obj->base.size,
@@ -3142,16 +3141,15 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	/* FIXME: For now we only ever use 1 VMA per object */
-	BUG_ON(!i915_is_ggtt(vm));
-	WARN_ON(!list_empty(&obj->vma_list));
-
-	vma = i915_gem_vma_create(obj, vm);
+	vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
 	if (IS_ERR(vma)) {
 		i915_gem_object_unpin_pages(obj);
 		return PTR_ERR(vma);
 	}
 
+	/* For now we only ever use 1 vma per object */
+	WARN_ON(!list_is_singular(&obj->vma_list));
+
 search_free:
 	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
 						  size, alignment,
@@ -4800,3 +4798,16 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 
 	return NULL;
 }
+
+struct i915_vma *
+i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
+				  struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+
+	vma = i915_gem_obj_to_vma(obj, vm);
+	if (!vma)
+		vma = i915_gem_vma_create(obj, vm);
+
+	return vma;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0f21702..3f17a55 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -85,14 +85,14 @@ static int
 eb_lookup_objects(struct eb_objects *eb,
 		  struct drm_i915_gem_exec_object2 *exec,
 		  const struct drm_i915_gem_execbuffer2 *args,
+		  struct i915_address_space *vm,
 		  struct drm_file *file)
 {
+	struct drm_i915_gem_object *obj;
 	int i;
 
 	spin_lock(&file->table_lock);
 	for (i = 0; i < args->buffer_count; i++) {
-		struct drm_i915_gem_object *obj;
-
 		obj = to_intel_bo(idr_find(&file->object_idr, exec[i].handle));
 		if (obj == NULL) {
 			spin_unlock(&file->table_lock);
@@ -110,6 +110,15 @@ eb_lookup_objects(struct eb_objects *eb,
 
 		drm_gem_object_reference(&obj->base);
 		list_add_tail(&obj->exec_list, &eb->objects);
+	}
+	spin_unlock(&file->table_lock);
+
+	list_for_each_entry(obj,  &eb->objects, exec_list) {
+		struct i915_vma *vma;
+
+		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
+		if (IS_ERR(vma))
+			return PTR_ERR(vma);
 
 		obj->exec_entry = &exec[i];
 		if (eb->and < 0) {
@@ -121,7 +130,6 @@ eb_lookup_objects(struct eb_objects *eb,
 				       &eb->buckets[handle & eb->and]);
 		}
 	}
-	spin_unlock(&file->table_lock);
 
 	return 0;
 }
@@ -672,7 +680,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 
 	/* reacquire the objects */
 	eb_reset(eb);
-	ret = eb_lookup_objects(eb, exec, args, file);
+	ret = eb_lookup_objects(eb, exec, args, vm, file);
 	if (ret)
 		goto err;
 
@@ -1009,7 +1017,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	}
 
 	/* Look up object handles */
-	ret = eb_lookup_objects(eb, exec, args, file);
+	ret = eb_lookup_objects(eb, exec, args, vm, file);
 	if (ret)
 		goto err;
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 25/29] drm/i915: Convert execbuf code to use vmas
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (23 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 24/29] drm/i915: create vmas at execbuf Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-06 20:43   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 26/29] drm/i915: Convert active API to VMA Ben Widawsky
                   ` (3 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This attempts to convert all the execbuf code to speak in vmas. Since
the execbuf code is very self contained it was a nice isolated
conversion.

The meat of the code is turning struct eb_objects into struct eb_vmas, and
then wiring up the rest of the code to use vmas instead of <obj, vm> pairs.

Unfortunately, to do this, we must move the exec_list link from the object
into the vma. This list is reused by the eviction code, so the eviction
code must be modified to match.
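
The trickiest bit is the two-phase lookup in eb_lookup_vmas(): references
are taken under file->table_lock, the lock is dropped, and only then are
the vmas looked up or created so the allocation needn't be GFP_ATOMIC.
Condensed sketch (error paths trimmed; see the full hunk below):

	spin_lock(&file->table_lock);
	for (i = 0; i < args->buffer_count; i++) {
		/* phase 1: only take references under the spinlock */
		obj = to_intel_bo(idr_find(&file->object_idr, exec[i].handle));
		drm_gem_object_reference(&obj->base);
		list_add_tail(&obj->obj_exec_list, &objects);
	}
	spin_unlock(&file->table_lock);

	i = 0;
	list_for_each_entry(obj, &objects, obj_exec_list) {
		/* phase 2: may allocate, so it must run outside the spinlock */
		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
		vma->exec_entry = &exec[i++];
		list_add_tail(&vma->exec_list, &eb->vmas);
	}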

v2: Release table lock early, and do a 2-phase vma lookup to avoid
having to use GFP_ATOMIC. (Chris)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  22 +-
 drivers/gpu/drm/i915/i915_gem.c            |   3 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  31 ++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 322 +++++++++++++++--------------
 4 files changed, 201 insertions(+), 177 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c0eb7fd..ee5164e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -550,6 +550,17 @@ struct i915_vma {
 	struct list_head mm_list;
 
 	struct list_head vma_link; /* Link in the object's VMA list */
+
+	/** This vma's place in the batchbuffer or on the eviction list */
+	struct list_head exec_list;
+
+	/**
+	 * Used for performing relocations during execbuffer insertion.
+	 */
+	struct hlist_node exec_node;
+	unsigned long exec_handle;
+	struct drm_i915_gem_exec_object2 *exec_entry;
+
 };
 
 struct i915_ctx_hang_stats {
@@ -1267,8 +1278,8 @@ struct drm_i915_gem_object {
 	struct list_head global_list;
 
 	struct list_head ring_list;
-	/** This object's place in the batchbuffer or on the eviction list */
-	struct list_head exec_list;
+	/** Used in execbuf to temporarily hold a ref */
+	struct list_head obj_exec_list;
 
 	/**
 	 * This is set if the object is on the active lists (has pending
@@ -1353,13 +1364,6 @@ struct drm_i915_gem_object {
 	void *dma_buf_vmapping;
 	int vmapping_count;
 
-	/**
-	 * Used for performing relocations during execbuffer insertion.
-	 */
-	struct hlist_node exec_node;
-	unsigned long exec_handle;
-	struct drm_i915_gem_exec_object2 *exec_entry;
-
 	struct intel_ring_buffer *ring;
 
 	/** Breadcrumb of last rendering to the buffer. */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 72bd53c..a4ba819 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3913,7 +3913,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 {
 	INIT_LIST_HEAD(&obj->global_list);
 	INIT_LIST_HEAD(&obj->ring_list);
-	INIT_LIST_HEAD(&obj->exec_list);
+	INIT_LIST_HEAD(&obj->obj_exec_list);
 	INIT_LIST_HEAD(&obj->vma_list);
 
 	obj->ops = ops;
@@ -4048,6 +4048,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 
 	INIT_LIST_HEAD(&vma->vma_link);
 	INIT_LIST_HEAD(&vma->mm_list);
+	INIT_LIST_HEAD(&vma->exec_list);
 	vma->vm = vm;
 	vma->obj = obj;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 425939b..8787588 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -37,7 +37,7 @@ mark_free(struct i915_vma *vma, struct list_head *unwind)
 	if (vma->obj->pin_count)
 		return false;
 
-	list_add(&vma->obj->exec_list, unwind);
+	list_add(&vma->exec_list, unwind);
 	return drm_mm_scan_add_block(&vma->node);
 }
 
@@ -49,7 +49,6 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct list_head eviction_list, unwind_list;
 	struct i915_vma *vma;
-	struct drm_i915_gem_object *obj;
 	int ret = 0;
 
 	trace_i915_gem_evict(dev, min_size, alignment, mappable);
@@ -104,14 +103,13 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 none:
 	/* Nothing found, clean up and bail out! */
 	while (!list_empty(&unwind_list)) {
-		obj = list_first_entry(&unwind_list,
-				       struct drm_i915_gem_object,
+		vma = list_first_entry(&unwind_list,
+				       struct i915_vma,
 				       exec_list);
-		vma = i915_gem_obj_to_vma(obj, vm);
 		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
-		list_del_init(&obj->exec_list);
+		list_del_init(&vma->exec_list);
 	}
 
 	/* We expect the caller to unpin, evict all and try again, or give up.
@@ -125,28 +123,27 @@ found:
 	 * temporary list. */
 	INIT_LIST_HEAD(&eviction_list);
 	while (!list_empty(&unwind_list)) {
-		obj = list_first_entry(&unwind_list,
-				       struct drm_i915_gem_object,
+		vma = list_first_entry(&unwind_list,
+				       struct i915_vma,
 				       exec_list);
-		vma = i915_gem_obj_to_vma(obj, vm);
 		if (drm_mm_scan_remove_block(&vma->node)) {
-			list_move(&obj->exec_list, &eviction_list);
-			drm_gem_object_reference(&obj->base);
+			list_move(&vma->exec_list, &eviction_list);
+			drm_gem_object_reference(&vma->obj->base);
 			continue;
 		}
-		list_del_init(&obj->exec_list);
+		list_del_init(&vma->exec_list);
 	}
 
 	/* Unbinding will emit any required flushes */
 	while (!list_empty(&eviction_list)) {
-		obj = list_first_entry(&eviction_list,
-				       struct drm_i915_gem_object,
+		vma = list_first_entry(&eviction_list,
+				       struct i915_vma,
 				       exec_list);
 		if (ret == 0)
-			ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));
+			ret = i915_vma_unbind(vma);
 
-		list_del_init(&obj->exec_list);
-		drm_gem_object_unreference(&obj->base);
+		list_del_init(&vma->exec_list);
+		drm_gem_object_unreference(&vma->obj->base);
 	}
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3f17a55..1c9d504 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -33,24 +33,24 @@
 #include "intel_drv.h"
 #include <linux/dma_remapping.h>
 
-struct eb_objects {
-	struct list_head objects;
+struct eb_vmas {
+	struct list_head vmas;
 	int and;
 	union {
-		struct drm_i915_gem_object *lut[0];
+		struct i915_vma *lut[0];
 		struct hlist_head buckets[0];
 	};
 };
 
-static struct eb_objects *
-eb_create(struct drm_i915_gem_execbuffer2 *args)
+static struct eb_vmas *
+eb_create(struct drm_i915_gem_execbuffer2 *args, struct i915_address_space *vm)
 {
-	struct eb_objects *eb = NULL;
+	struct eb_vmas *eb = NULL;
 
 	if (args->flags & I915_EXEC_HANDLE_LUT) {
 		int size = args->buffer_count;
-		size *= sizeof(struct drm_i915_gem_object *);
-		size += sizeof(struct eb_objects);
+		size *= sizeof(struct i915_vma *);
+		size += sizeof(struct eb_vmas);
 		eb = kmalloc(size, GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
 	}
 
@@ -61,7 +61,7 @@ eb_create(struct drm_i915_gem_execbuffer2 *args)
 		while (count > 2*size)
 			count >>= 1;
 		eb = kzalloc(count*sizeof(struct hlist_head) +
-			     sizeof(struct eb_objects),
+			     sizeof(struct eb_vmas),
 			     GFP_TEMPORARY);
 		if (eb == NULL)
 			return eb;
@@ -70,72 +70,97 @@ eb_create(struct drm_i915_gem_execbuffer2 *args)
 	} else
 		eb->and = -args->buffer_count;
 
-	INIT_LIST_HEAD(&eb->objects);
+	INIT_LIST_HEAD(&eb->vmas);
 	return eb;
 }
 
 static void
-eb_reset(struct eb_objects *eb)
+eb_reset(struct eb_vmas *eb)
 {
 	if (eb->and >= 0)
 		memset(eb->buckets, 0, (eb->and+1)*sizeof(struct hlist_head));
 }
 
 static int
-eb_lookup_objects(struct eb_objects *eb,
-		  struct drm_i915_gem_exec_object2 *exec,
-		  const struct drm_i915_gem_execbuffer2 *args,
-		  struct i915_address_space *vm,
-		  struct drm_file *file)
+eb_lookup_vmas(struct eb_vmas *eb,
+	       struct drm_i915_gem_exec_object2 *exec,
+	       const struct drm_i915_gem_execbuffer2 *args,
+	       struct i915_address_space *vm,
+	       struct drm_file *file)
 {
 	struct drm_i915_gem_object *obj;
-	int i;
+	struct list_head objects;
+	int i, ret = 0;
 
+	INIT_LIST_HEAD(&objects);
 	spin_lock(&file->table_lock);
+	/* Grab a reference to the object and release the lock so we can look up
+	 * or create the VMA without using GFP_ATOMIC */
 	for (i = 0; i < args->buffer_count; i++) {
 		obj = to_intel_bo(idr_find(&file->object_idr, exec[i].handle));
 		if (obj == NULL) {
 			spin_unlock(&file->table_lock);
 			DRM_DEBUG("Invalid object handle %d at index %d\n",
 				   exec[i].handle, i);
-			return -ENOENT;
+			ret = -ENOENT;
+			goto out;
 		}
 
-		if (!list_empty(&obj->exec_list)) {
+		if (!list_empty(&obj->obj_exec_list)) {
 			spin_unlock(&file->table_lock);
 			DRM_DEBUG("Object %p [handle %d, index %d] appears more than once in object list\n",
 				   obj, exec[i].handle, i);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto out;
 		}
 
 		drm_gem_object_reference(&obj->base);
-		list_add_tail(&obj->exec_list, &eb->objects);
+		list_add_tail(&obj->obj_exec_list, &objects);
 	}
 	spin_unlock(&file->table_lock);
 
-	list_for_each_entry(obj,  &eb->objects, exec_list) {
+	i = 0;
+	list_for_each_entry(obj, &objects, obj_exec_list) {
 		struct i915_vma *vma;
 
 		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
-		if (IS_ERR(vma))
-			return PTR_ERR(vma);
+		if (IS_ERR(vma)) {
+			/* XXX: We don't need an error path for vma because if
+			 * the vma was created just for this execbuf, object
+			 * unreference should kill it off. */
+			DRM_DEBUG("Failed to lookup VMA\n");
+			ret = PTR_ERR(vma);
+			goto out;
+		}
+
+		list_add_tail(&vma->exec_list, &eb->vmas);
 
-		obj->exec_entry = &exec[i];
+		vma->exec_entry = &exec[i];
 		if (eb->and < 0) {
-			eb->lut[i] = obj;
+			eb->lut[i] = vma;
 		} else {
 			uint32_t handle = args->flags & I915_EXEC_HANDLE_LUT ? i : exec[i].handle;
-			obj->exec_handle = handle;
-			hlist_add_head(&obj->exec_node,
+			vma->exec_handle = handle;
+			hlist_add_head(&vma->exec_node,
 				       &eb->buckets[handle & eb->and]);
 		}
+		++i;
 	}
 
-	return 0;
+
+out:
+	while (!list_empty(&objects)) {
+		obj = list_first_entry(&objects,
+				       struct drm_i915_gem_object,
+				       obj_exec_list);
+		list_del_init(&obj->obj_exec_list);
+		if (ret)
+			drm_gem_object_unreference(&obj->base);
+	}
+	return ret;
 }
 
-static struct drm_i915_gem_object *
-eb_get_object(struct eb_objects *eb, unsigned long handle)
+static struct i915_vma *eb_get_vma(struct eb_vmas *eb, unsigned long handle)
 {
 	if (eb->and < 0) {
 		if (handle >= -eb->and)
@@ -147,27 +172,25 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
 
 		head = &eb->buckets[handle & eb->and];
 		hlist_for_each(node, head) {
-			struct drm_i915_gem_object *obj;
+			struct i915_vma *vma;
 
-			obj = hlist_entry(node, struct drm_i915_gem_object, exec_node);
-			if (obj->exec_handle == handle)
-				return obj;
+			vma = hlist_entry(node, struct i915_vma, exec_node);
+			if (vma->exec_handle == handle)
+				return vma;
 		}
 		return NULL;
 	}
 }
 
-static void
-eb_destroy(struct eb_objects *eb)
-{
-	while (!list_empty(&eb->objects)) {
-		struct drm_i915_gem_object *obj;
+static void eb_destroy(struct eb_vmas *eb) {
+	while (!list_empty(&eb->vmas)) {
+		struct i915_vma *vma;
 
-		obj = list_first_entry(&eb->objects,
-				       struct drm_i915_gem_object,
+		vma = list_first_entry(&eb->vmas,
+				       struct i915_vma,
 				       exec_list);
-		list_del_init(&obj->exec_list);
-		drm_gem_object_unreference(&obj->base);
+		list_del_init(&vma->exec_list);
+		drm_gem_object_unreference(&vma->obj->base);
 	}
 	kfree(eb);
 }
@@ -181,22 +204,24 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
 
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
-				   struct eb_objects *eb,
+				   struct eb_vmas *eb,
 				   struct drm_i915_gem_relocation_entry *reloc,
 				   struct i915_address_space *vm)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_gem_object *target_obj;
 	struct drm_i915_gem_object *target_i915_obj;
+	struct i915_vma *target_vma;
 	uint32_t target_offset;
 	int ret = -EINVAL;
 
 	/* we've already hold a reference to all valid objects */
-	target_obj = &eb_get_object(eb, reloc->target_handle)->base;
-	if (unlikely(target_obj == NULL))
+	target_vma = eb_get_vma(eb, reloc->target_handle);
+	if (unlikely(target_vma == NULL))
 		return -ENOENT;
+	target_i915_obj = target_vma->obj;
+	target_obj = &target_vma->obj->base;
 
-	target_i915_obj = to_intel_bo(target_obj);
 	target_offset = i915_gem_obj_ggtt_offset(target_i915_obj);
 
 	/* Sandybridge PPGTT errata: We need a global gtt mapping for MI and
@@ -305,14 +330,13 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
-				    struct eb_objects *eb,
-				    struct i915_address_space *vm)
+i915_gem_execbuffer_relocate_vma(struct i915_vma *vma,
+				 struct eb_vmas *eb)
 {
 #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
 	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
 	struct drm_i915_gem_relocation_entry __user *user_relocs;
-	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	int remain, ret;
 
 	user_relocs = to_user_ptr(entry->relocs_ptr);
@@ -331,8 +355,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 		do {
 			u64 offset = r->presumed_offset;
 
-			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
-								 vm);
+			ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, r,
+								 vma->vm);
 			if (ret)
 				return ret;
 
@@ -353,17 +377,16 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
-					 struct eb_objects *eb,
-					 struct drm_i915_gem_relocation_entry *relocs,
-					 struct i915_address_space *vm)
+i915_gem_execbuffer_relocate_vma_slow(struct i915_vma *vma,
+				      struct eb_vmas *eb,
+				      struct drm_i915_gem_relocation_entry *relocs)
 {
-	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+	const struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	int i, ret;
 
 	for (i = 0; i < entry->relocation_count; i++) {
-		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
-							 vm);
+		ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, &relocs[i],
+							 vma->vm);
 		if (ret)
 			return ret;
 	}
@@ -372,10 +395,10 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate(struct eb_objects *eb,
+i915_gem_execbuffer_relocate(struct eb_vmas *eb,
 			     struct i915_address_space *vm)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	int ret = 0;
 
 	/* This is the fast path and we cannot handle a pagefault whilst
@@ -386,8 +409,8 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb,
 	 * lockdep complains vehemently.
 	 */
 	pagefault_disable();
-	list_for_each_entry(obj, &eb->objects, exec_list) {
-		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
+	list_for_each_entry(vma, &eb->vmas, exec_list) {
+		ret = i915_gem_execbuffer_relocate_vma(vma, eb);
 		if (ret)
 			break;
 	}
@@ -400,31 +423,31 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb,
 #define  __EXEC_OBJECT_HAS_FENCE (1<<30)
 
 static int
-need_reloc_mappable(struct drm_i915_gem_object *obj)
+need_reloc_mappable(struct i915_vma *vma)
 {
-	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
-	return entry->relocation_count && !use_cpu_reloc(obj);
+	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
+	return entry->relocation_count && !use_cpu_reloc(vma->obj);
 }
 
 static int
-i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
-				   struct intel_ring_buffer *ring,
-				   struct i915_address_space *vm,
-				   bool *need_reloc)
+i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
+				struct intel_ring_buffer *ring,
+				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
+	struct drm_i915_gem_object *obj = vma->obj;
 	int ret;
 
 	need_fence =
 		has_fenced_gpu_access &&
 		entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 		obj->tiling_mode != I915_TILING_NONE;
-	need_mappable = need_fence || need_reloc_mappable(obj);
+	need_mappable = need_fence || need_reloc_mappable(vma);
 
-	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
+	ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, need_mappable,
 				  false);
 	if (ret)
 		return ret;
@@ -452,8 +475,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->has_aliasing_ppgtt_mapping = 1;
 	}
 
-	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
-		entry->offset = i915_gem_obj_offset(obj, vm);
+	if (entry->offset != vma->node.start) {
+		entry->offset = vma->node.start;
 		*need_reloc = true;
 	}
 
@@ -470,61 +493,60 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
+i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma)
 {
 	struct drm_i915_gem_exec_object2 *entry;
 
-	if (!i915_gem_obj_bound_any(obj))
+	if (!drm_mm_node_allocated(&vma->node))
 		return;
 
-	entry = obj->exec_entry;
+	entry = vma->exec_entry;
 
 	if (entry->flags & __EXEC_OBJECT_HAS_FENCE)
-		i915_gem_object_unpin_fence(obj);
+		i915_gem_object_unpin_fence(vma->obj);
 
 	if (entry->flags & __EXEC_OBJECT_HAS_PIN)
-		i915_gem_object_unpin(obj);
+		i915_gem_object_unpin(vma->obj);
 
 	entry->flags &= ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN);
 }
 
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
-			    struct list_head *objects,
-			    struct i915_address_space *vm,
+			    struct list_head *vmas,
 			    bool *need_relocs)
 {
 	struct drm_i915_gem_object *obj;
-	struct list_head ordered_objects;
+	struct i915_vma *vma;
+	struct list_head ordered_vmas;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	int retry;
 
-	INIT_LIST_HEAD(&ordered_objects);
-	while (!list_empty(objects)) {
+	INIT_LIST_HEAD(&ordered_vmas);
+	while (!list_empty(vmas)) {
 		struct drm_i915_gem_exec_object2 *entry;
 		bool need_fence, need_mappable;
 
-		obj = list_first_entry(objects,
-				       struct drm_i915_gem_object,
-				       exec_list);
-		entry = obj->exec_entry;
+		vma = list_first_entry(vmas, struct i915_vma, exec_list);
+		obj = vma->obj;
+		entry = vma->exec_entry;
 
 		need_fence =
 			has_fenced_gpu_access &&
 			entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 			obj->tiling_mode != I915_TILING_NONE;
-		need_mappable = need_fence || need_reloc_mappable(obj);
+		need_mappable = need_fence || need_reloc_mappable(vma);
 
 		if (need_mappable)
-			list_move(&obj->exec_list, &ordered_objects);
+			list_move(&vma->exec_list, &ordered_vmas);
 		else
-			list_move_tail(&obj->exec_list, &ordered_objects);
+			list_move_tail(&vma->exec_list, &ordered_vmas);
 
 		obj->base.pending_read_domains = I915_GEM_GPU_DOMAINS & ~I915_GEM_DOMAIN_COMMAND;
 		obj->base.pending_write_domain = 0;
 		obj->pending_fenced_gpu_access = false;
 	}
-	list_splice(&ordered_objects, objects);
+	list_splice(&ordered_vmas, vmas);
 
 	/* Attempt to pin all of the buffers into the GTT.
 	 * This is done in 3 phases:
@@ -543,47 +565,47 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 		int ret = 0;
 
 		/* Unbind any ill-fitting objects or pin. */
-		list_for_each_entry(obj, objects, exec_list) {
-			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+		list_for_each_entry(vma, vmas, exec_list) {
+			struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 			bool need_fence, need_mappable;
-			u32 obj_offset;
 
-			if (!i915_gem_obj_bound(obj, vm))
+			obj = vma->obj;
+
+			if (!drm_mm_node_allocated(&vma->node))
 				continue;
 
-			obj_offset = i915_gem_obj_offset(obj, vm);
 			need_fence =
 				has_fenced_gpu_access &&
 				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 				obj->tiling_mode != I915_TILING_NONE;
-			need_mappable = need_fence || need_reloc_mappable(obj);
+			need_mappable = need_fence || need_reloc_mappable(vma);
 
 			BUG_ON((need_mappable || need_fence) &&
-			       !i915_is_ggtt(vm));
+			       !i915_is_ggtt(vma->vm));
 
 			if ((entry->alignment &&
-			     obj_offset & (entry->alignment - 1)) ||
+			     vma->node.start & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
-				ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));
+				ret = i915_vma_unbind(vma);
 			else
-				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
+				ret = i915_gem_execbuffer_reserve_vma(vma, ring, need_relocs);
 			if (ret)
 				goto err;
 		}
 
 		/* Bind fresh objects */
-		list_for_each_entry(obj, objects, exec_list) {
-			if (i915_gem_obj_bound(obj, vm))
+		list_for_each_entry(vma, vmas, exec_list) {
+			if (drm_mm_node_allocated(&vma->node))
 				continue;
 
-			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
+			ret = i915_gem_execbuffer_reserve_vma(vma, ring, need_relocs);
 			if (ret)
 				goto err;
 		}
 
 err:		/* Decrement pin count for bound objects */
-		list_for_each_entry(obj, objects, exec_list)
-			i915_gem_execbuffer_unreserve_object(obj);
+		list_for_each_entry(vma, vmas, exec_list)
+			i915_gem_execbuffer_unreserve_vma(vma);
 
 		if (ret != -ENOSPC || retry++)
 			return ret;
@@ -599,24 +621,27 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 				  struct drm_i915_gem_execbuffer2 *args,
 				  struct drm_file *file,
 				  struct intel_ring_buffer *ring,
-				  struct eb_objects *eb,
-				  struct drm_i915_gem_exec_object2 *exec,
-				  struct i915_address_space *vm)
+				  struct eb_vmas *eb,
+				  struct drm_i915_gem_exec_object2 *exec)
 {
 	struct drm_i915_gem_relocation_entry *reloc;
-	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	bool need_relocs;
 	int *reloc_offset;
 	int i, total, ret;
 	int count = args->buffer_count;
 
+	if (WARN_ON(list_empty(&eb->vmas)))
+		return 0;
+
+	vm = list_first_entry(&eb->vmas, struct i915_vma, exec_list)->vm;
+
 	/* We may process another execbuffer during the unlock... */
-	while (!list_empty(&eb->objects)) {
-		obj = list_first_entry(&eb->objects,
-				       struct drm_i915_gem_object,
-				       exec_list);
-		list_del_init(&obj->exec_list);
-		drm_gem_object_unreference(&obj->base);
+	while (!list_empty(&eb->vmas)) {
+		vma = list_first_entry(&eb->vmas, struct i915_vma, exec_list);
+		list_del_init(&vma->exec_list);
+		drm_gem_object_unreference(&vma->obj->base);
 	}
 
 	mutex_unlock(&dev->struct_mutex);
@@ -680,20 +705,19 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 
 	/* reacquire the objects */
 	eb_reset(eb);
-	ret = eb_lookup_objects(eb, exec, args, vm, file);
+	ret = eb_lookup_vmas(eb, exec, args, vm, file);
 	if (ret)
 		goto err;
 
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->vmas, &need_relocs);
 	if (ret)
 		goto err;
 
-	list_for_each_entry(obj, &eb->objects, exec_list) {
-		int offset = obj->exec_entry - exec;
-		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
-							       reloc + reloc_offset[offset],
-							       vm);
+	list_for_each_entry(vma, &eb->vmas, exec_list) {
+		int offset = vma->exec_entry - exec;
+		ret = i915_gem_execbuffer_relocate_vma_slow(vma, eb,
+							    reloc + reloc_offset[offset]);
 		if (ret)
 			goto err;
 	}
@@ -712,21 +736,21 @@ err:
 
 static int
 i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring,
-				struct list_head *objects)
+				struct list_head *vmas)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	uint32_t flush_domains = 0;
 	int ret;
 
-	list_for_each_entry(obj, objects, exec_list) {
-		ret = i915_gem_object_sync(obj, ring);
+	list_for_each_entry(vma, vmas, exec_list) {
+		ret = i915_gem_object_sync(vma->obj, ring);
 		if (ret)
 			return ret;
 
-		if (obj->base.write_domain & I915_GEM_DOMAIN_CPU)
-			i915_gem_clflush_object(obj);
+		if (vma->obj->base.write_domain & I915_GEM_DOMAIN_CPU)
+			i915_gem_clflush_object(vma->obj);
 
-		flush_domains |= obj->base.write_domain;
+		flush_domains |= vma->obj->base.write_domain;
 	}
 
 	if (flush_domains & I915_GEM_DOMAIN_CPU)
@@ -793,13 +817,13 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
 }
 
 static void
-i915_gem_execbuffer_move_to_active(struct list_head *objects,
-				   struct i915_address_space *vm,
+i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 				   struct intel_ring_buffer *ring)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 
-	list_for_each_entry(obj, objects, exec_list) {
+	list_for_each_entry(vma, vmas, exec_list) {
+		struct drm_i915_gem_object *obj = vma->obj;
 		u32 old_read = obj->base.read_domains;
 		u32 old_write = obj->base.write_domain;
 
@@ -810,7 +834,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
 		/* FIXME: This lookup gets fixed later <-- danvet */
-		list_move_tail(&i915_gem_obj_to_vma(obj, vm)->mm_list, &vm->active_list);
+		list_move_tail(&vma->mm_list, &vma->vm->active_list);
 		i915_gem_object_move_to_active(obj, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
@@ -869,7 +893,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct eb_objects *eb;
+	struct eb_vmas *eb;
 	struct drm_i915_gem_object *batch_obj;
 	struct drm_clip_rect *cliprects = NULL;
 	struct intel_ring_buffer *ring;
@@ -1009,7 +1033,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		goto pre_mutex_err;
 	}
 
-	eb = eb_create(args);
+	eb = eb_create(args, vm);
 	if (eb == NULL) {
 		mutex_unlock(&dev->struct_mutex);
 		ret = -ENOMEM;
@@ -1017,18 +1041,16 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	}
 
 	/* Look up object handles */
-	ret = eb_lookup_objects(eb, exec, args, vm, file);
+	ret = eb_lookup_vmas(eb, exec, args, vm, file);
 	if (ret)
 		goto err;
 
 	/* take note of the batch buffer before we might reorder the lists */
-	batch_obj = list_entry(eb->objects.prev,
-			       struct drm_i915_gem_object,
-			       exec_list);
+	batch_obj = list_entry(eb->vmas.prev, struct i915_vma, exec_list)->obj;
 
 	/* Move the objects en-masse into the GTT, evicting if necessary. */
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->vmas, &need_relocs);
 	if (ret)
 		goto err;
 
@@ -1038,7 +1060,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (ret) {
 		if (ret == -EFAULT) {
 			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
-								eb, exec, vm);
+								eb, exec);
 			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 		}
 		if (ret)
@@ -1060,7 +1082,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
 		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
 
-	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->objects);
+	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
 		goto err;
 
@@ -1115,7 +1137,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
-	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
+	i915_gem_execbuffer_move_to_active(&eb->vmas, ring);
 	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
 
 err:
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 26/29] drm/i915: Convert active API to VMA
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (24 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 25/29] drm/i915: Convert execbuf code to use vmas Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-06 20:47   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 27/29] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
                   ` (2 subsequent siblings)
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Even though we track activeness per object and not per VMA, the active
lists themselves are per VM, so it makes the most sense to use VMAs in
the APIs.

NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
now, leave them be.
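
The new entry point is a thin wrapper - it does the per-VM LRU bookkeeping
and then falls through to the existing (now static) per-object tracking.
Sketch, matching the hunk below:

	void i915_vma_move_to_active(struct i915_vma *vma,
				     struct intel_ring_buffer *ring)
	{
		/* per-VM LRU bookkeeping... */
		list_move_tail(&vma->mm_list, &vma->vm->active_list);
		/* ...then the unchanged per-object active tracking */
		return i915_gem_object_move_to_active(vma->obj, ring);
	}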

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  5 ++---
 drivers/gpu/drm/i915/i915_gem.c            | 11 +++++++++--
 drivers/gpu/drm/i915/i915_gem_context.c    |  8 ++++----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 +---
 4 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ee5164e..695f1e5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1735,9 +1735,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
-void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
-				    struct intel_ring_buffer *ring);
-
+void i915_vma_move_to_active(struct i915_vma *vma,
+			     struct intel_ring_buffer *ring);
 int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
 			 struct drm_mode_create_dumb *args);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a4ba819..24c1a91 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1866,11 +1866,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	return 0;
 }
 
-void
+static void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 			       struct intel_ring_buffer *ring)
 {
-	struct drm_device *dev = obj->base.dev;
+	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 seqno = intel_ring_get_seqno(ring);
 
@@ -1905,6 +1905,13 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 }
 
+void i915_vma_move_to_active(struct i915_vma *vma,
+			     struct intel_ring_buffer *ring)
+{
+	list_move_tail(&vma->mm_list, &vma->vm->active_list);
+	return i915_gem_object_move_to_active(vma->obj, ring);
+}
+
 static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 88b0f52..147399c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -436,11 +436,11 @@ static int do_switch(struct i915_hw_context *to)
 	 * MI_SET_CONTEXT instead of when the next seqno has completed.
 	 */
 	if (from != NULL) {
-		struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
-		struct i915_address_space *ggtt = &dev_priv->gtt.base;
+		struct drm_i915_private *dev_priv = ring->dev->dev_private;
+		struct i915_vma *vma =
+			i915_gem_obj_to_vma(from->obj, &dev_priv->gtt.base);
 		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-		list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
-		i915_gem_object_move_to_active(from->obj, ring);
+		i915_vma_move_to_active(vma, ring);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
 		 * object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 1c9d504..b8bb7f5 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -833,9 +833,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
-		/* FIXME: This lookup gets fixed later <-- danvet */
-		list_move_tail(&vma->mm_list, &vma->vm->active_list);
-		i915_gem_object_move_to_active(obj, ring);
+		i915_vma_move_to_active(vma, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 27/29] drm/i915: Add bind/unbind object functions to VM
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (25 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 26/29] drm/i915: Convert active API to VMA Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-01  0:00 ` [PATCH 28/29] drm/i915: Use the new vm [un]bind functions Ben Widawsky
  2013-08-01  0:00 ` [PATCH 29/29] drm/i915: eliminate vm->insert_entries() Ben Widawsky
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let the address space choose
the correct way to handle the page table updates. This change allows
many places in the code to simply call vm->bind_vma()/vm->unbind_vma()
and not have to worry about distinguishing PPGTT vs. GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
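
For reference, the two hooks the address space gains (matching the
i915_drv.h hunk below); once the next patch flips the callers over,
binding becomes a plain vma->vm->bind_vma(vma, cache_level, flags) call
regardless of whether the vm is the GGTT or a PPGTT:

	/* Unmap an object from an address space, i.e. point its PTEs back
	 * at the reserved scratch page. */
	void (*unbind_vma)(struct i915_vma *vma);

	/* Map an object into an address space with the given cache flags. */
	#define GLOBAL_BIND (1<<0)
	void (*bind_vma)(struct i915_vma *vma,
			 enum i915_cache_level cache_level,
			 u32 flags);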

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels
Use VMA for bind/unbind (Daniel, Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  69 +++++++++++++-----------
 drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
 2 files changed, 140 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 695f1e5..2849297 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -446,6 +446,36 @@ enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an objects lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct drm_mm_node node;
+	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+
+	/** This vma's place in the batchbuffer or on the eviction list */
+	struct list_head exec_list;
+
+	/**
+	 * Used for performing relocations during execbuffer insertion.
+	 */
+	struct hlist_node exec_node;
+	unsigned long exec_handle;
+	struct drm_i915_gem_exec_object2 *exec_entry;
+
+};
+
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
@@ -484,9 +514,18 @@ struct i915_address_space {
 	/* FIXME: Need a more generic return type */
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
+
+	/** Unmap an object from an address space. This usually consists of
+	 * setting the valid PTE entries to a reserved scratch page. */
+	void (*unbind_vma)(struct i915_vma *vma);
 	void (*clear_range)(struct i915_address_space *vm,
 			    unsigned int first_entry,
 			    unsigned int num_entries);
+	/* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+	void (*bind_vma)(struct i915_vma *vma,
+			 enum i915_cache_level cache_level,
+			 u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       unsigned int first_entry,
@@ -533,36 +572,6 @@ struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an objects lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
-	struct drm_mm_node node;
-	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm;
-
-	/** This object's place on the active/inactive lists */
-	struct list_head mm_list;
-
-	struct list_head vma_link; /* Link in the object's VMA list */
-
-	/** This vma's place in the batchbuffer or on the eviction list */
-	struct list_head exec_list;
-
-	/**
-	 * Used for performing relocations during execbuffer insertion.
-	 */
-	struct hlist_node exec_node;
-	unsigned long exec_handle;
-	struct drm_i915_gem_exec_object2 *exec_entry;
-
-};
-
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index e9b269f..39ac266 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -55,6 +55,11 @@
 #define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
 #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 
+static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
+				enum i915_cache_level cache_level,
+				u32 flags);
+static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
+
 static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
 				      enum i915_cache_level level)
 {
@@ -307,7 +312,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	}
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->base.unbind_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -414,6 +421,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 				   cache_level);
 }
 
+static void __always_unused
+gen6_ppgtt_bind_vma(struct i915_vma *vma,
+		    enum i915_cache_level cache_level,
+		    u32 flags)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	WARN_ON(flags);
+
+	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
@@ -422,6 +441,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 				obj->base.size >> PAGE_SHIFT);
 }
 
+static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ppgtt_clear_range(vma->vm, entry,
+			       vma->obj->base.size >> PAGE_SHIFT);
+}
+
 extern int intel_iommu_gfx_mapped;
 /* Certain Gen5 chipsets require require idling the GPU before
  * unmapping anything from the GTT when VT-d is enabled.
@@ -570,6 +597,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 
 }
 
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 unused)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	vma->obj->has_global_gtt_mapping = 1;
+}
+
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
@@ -577,6 +617,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned int first = vma->node.start >> PAGE_SHIFT;
+	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	vma->obj->has_global_gtt_mapping = 0;
+	intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 flags)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
+	 * the global, just use aliasing */
+	if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
+	    !obj->has_global_gtt_mapping) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+		return;
+	}
+
+	gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
+	obj->has_global_gtt_mapping = 1;
+
+	/* Put the mapping in the aliasing PPGTT as well as the global GTT when
+	 * we have aliasing but the user requested a global binding. */
+	if (dev_priv->mm.aliasing_ppgtt) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+	}
+}
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 			      enum i915_cache_level cache_level)
@@ -605,6 +685,23 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 	obj->has_global_gtt_mapping = 0;
 }
 
+static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ggtt_clear_range(vma->vm, entry,
+			      vma->obj->base.size >> PAGE_SHIFT);
+	vma->obj->has_global_gtt_mapping = 0;
+	if (dev_priv->mm.aliasing_ppgtt && vma->obj->has_aliasing_ppgtt_mapping) {
+		gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+				       entry,
+				       vma->obj->base.size >> PAGE_SHIFT);
+		vma->obj->has_aliasing_ppgtt_mapping = 0;
+	}
+}
+
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -838,7 +935,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
 		DRM_ERROR("Scratch setup failed\n");
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
 
 	return ret;
 }
@@ -870,7 +969,9 @@ static int i915_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
 
 	return 0;
 }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 28/29] drm/i915: Use the new vm [un]bind functions
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (26 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 27/29] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  2013-08-06 20:58   ` Daniel Vetter
  2013-08-01  0:00 ` [PATCH 29/29] drm/i915: eliminate vm->insert_entries() Ben Widawsky
  28 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
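
Roughly, the call-site pattern becomes the following (a sketch only, not the
literal hunks; the real call sites are in the diff below):

	/* before: obj-based helper, global GTT only */
	i915_gem_gtt_bind_object(obj, obj->cache_level);

	/* after: per-VM hook, vma-based; GLOBAL_BIND forces a global GTT
	 * binding even when an aliasing PPGTT is present */
	vma = i915_gem_obj_to_vma(obj, vm);
	vma->vm->bind_vma(vma, obj->cache_level, flags);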

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to the global GTT when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  9 ------
 drivers/gpu/drm/i915/i915_gem.c            | 31 ++++++++-----------
 drivers/gpu/drm/i915/i915_gem_context.c    |  8 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 ++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------------
 5 files changed, 34 insertions(+), 91 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2849297..a9c3110 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1938,17 +1938,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 24c1a91..1f35ae4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2631,12 +2631,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	trace_i915_vma_unbind(vma);
 
-	if (obj->has_global_gtt_mapping && i915_is_ggtt(vma->vm))
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma->vm->unbind_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3354,7 +3350,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3393,11 +3388,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
+		list_for_each_entry(vma, &obj->vma_list, vma_link)
+			vma->vm->bind_vma(vma, cache_level, 0);
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -3676,6 +3668,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
 	struct i915_vma *vma;
 	int ret;
 
@@ -3704,20 +3697,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->bind_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bind an object
+	 * the first time, and we had aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind.*/
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 147399c..10a5618 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -415,8 +416,11 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
+							   &dev_priv->gtt.base);
+		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b8bb7f5..4719e74 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -230,8 +230,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(target_i915_obj, vm);
+		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
+				 GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -434,11 +435,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 				struct intel_ring_buffer *ring,
 				bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
-	struct drm_i915_gem_object *obj = vma->obj;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
 	int ret;
 
 	need_fence =
@@ -467,14 +469,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != vma->node.start) {
 		entry->offset = vma->node.start;
 		*need_reloc = true;
@@ -485,9 +479,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vma->vm->bind_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -1077,8 +1069,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE &&
+	    !batch_obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
+		vm->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
+	}
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 39ac266..74b5077 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -412,15 +412,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
-}
-
 static void __always_unused
 gen6_ppgtt_bind_vma(struct i915_vma *vma,
 		    enum i915_cache_level cache_level,
@@ -433,14 +424,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -501,8 +484,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->bind_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -658,33 +643,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 29/29] drm/i915: eliminate vm->insert_entries()
  2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
                   ` (27 preceding siblings ...)
  2013-08-01  0:00 ` [PATCH 28/29] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-08-01  0:00 ` Ben Widawsky
  28 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-01  0:00 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range as well, but
it's not totally easy at this point since it's still used in a couple of
places that don't deal only in objects: setup, ppgtt init, and restoring
GTT mappings.
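
After this patch the global GTT keeps, roughly, the following hooks (a
sketch; the real assignments are in the gmch probe functions below):

	/* insert_entries is gone; all binding now goes through bind_vma */
	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
	dev_priv->gtt.base.bind_vma    = gen6_ggtt_bind_vma;
	dev_priv->gtt.base.unbind_vma  = gen6_ggtt_unbind_vma;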

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 74b5077..4e6e176 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -314,8 +314,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->enable = gen6_ppgtt_enable;
 	ppgtt->base.unbind_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
-	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
+	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
 	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
@@ -569,19 +569,6 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 	readl(gtt_base);
 }
 
-
-static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     unsigned int pg_start,
-				     enum i915_cache_level cache_level)
-{
-	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
-		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
-
-	intel_gtt_insert_sg_entries(st, pg_start, flags);
-
-}
-
 static void i915_ggtt_bind_vma(struct i915_vma *vma,
 			       enum i915_cache_level cache_level,
 			       u32 unused)
@@ -894,7 +881,6 @@ static int gen6_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
 	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
-	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
 	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
 
 	return ret;
@@ -928,7 +914,6 @@ static int i915_gmch_probe(struct drm_device *dev,
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
 	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
-	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
 	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
 
 	return 0;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 13/29] drm/i915: clear domains for all objects on reset
  2013-08-01  0:00 ` [PATCH 13/29] drm/i915: clear domains for all objects on reset Ben Widawsky
@ 2013-08-03 10:59   ` Chris Wilson
  2013-08-03 22:24     ` Ben Widawsky
  2013-08-05 16:46   ` [PATCH 13/29] drm/i915: eliminate dead domain clearing " Ben Widawsky
  1 sibling, 1 reply; 70+ messages in thread
From: Chris Wilson @ 2013-08-03 10:59 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:06PM -0700, Ben Widawsky wrote:
> Simply iterating over 1 inactive list is insufficient for the way we now
> track inactive (1 list per address space). We could alternatively do
> this with bound + unbound lists, and an inactive check. To me, this way
> is a bit easier to understand.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index b4c35f0..8ce3545 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2282,7 +2282,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
>  	struct drm_i915_gem_object *obj;
>  	struct intel_ring_buffer *ring;
>  	int i;
> @@ -2293,8 +2293,9 @@ void i915_gem_reset(struct drm_device *dev)
>  	/* Move everything out of the GPU domains to ensure we do any
>  	 * necessary invalidation upon reuse.
>  	 */
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;

This code is dead. Just remove it rather than port it to vma.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any
  2013-08-01  0:00 ` [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any Ben Widawsky
@ 2013-08-03 11:03   ` Chris Wilson
  2013-08-03 22:26     ` Ben Widawsky
  2013-08-06 18:43   ` Daniel Vetter
  1 sibling, 1 reply; 70+ messages in thread
From: Chris Wilson @ 2013-08-03 11:03 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:12PM -0700, Ben Widawsky wrote:
> In some places, we want to know if an object is bound in any address
> space, and not just the global GTT. This often applies when there is a
> single global resource (object, pages, etc.)
> 
> function                             |      reason
> --------------------------------------------------
> i915_gem_object_is_inactive          | global object
> i915_gem_object_put_pages            | object's pages
> 915_gem_object_unpin                 | global object
> i915_gem_execbuffer_unreserve_object | temporary until we plumb vma
> pread/pread                          | object's domain

Most of these we want to operate on ggtt, so this change looks confused
at best.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 02/29] drm/i915: Rework drop caches for checkpatch
  2013-07-31 23:59 ` [PATCH 02/29] drm/i915: Rework drop caches for checkpatch Ben Widawsky
@ 2013-08-03 11:32   ` Chris Wilson
  2013-08-03 22:10     ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Chris Wilson @ 2013-08-03 11:32 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 04:59:55PM -0700, Ben Widawsky wrote:
> With an upcoming change to bind, to make checkpatch happy and keep the
> code clean, we need to rework this code a bit.
> 
> This should have no functional impact.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index be69807..61ffa71 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1781,12 +1781,13 @@ i915_drop_caches_set(void *data, u64 val)
>  
>  	if (val & DROP_BOUND) {
>  		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> -					 mm_list)
> -			if (obj->pin_count == 0) {
> -				ret = i915_gem_object_unbind(obj);
> -				if (ret)
> -					goto unlock;
> -			}
> +					 mm_list) {
> +			if (obj->pin_count)
> +				continue;

Give me a newline here, and I'm sold.

> +			ret = i915_gem_object_unbind(obj);
> +			if (ret)
> +				goto unlock;
> +		}
>  	}

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 02/29] drm/i915: Rework drop caches for checkpatch
  2013-08-03 11:32   ` Chris Wilson
@ 2013-08-03 22:10     ` Ben Widawsky
  0 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-03 22:10 UTC (permalink / raw)
  To: Chris Wilson, Intel GFX

On Sat, Aug 03, 2013 at 12:32:35PM +0100, Chris Wilson wrote:
> On Wed, Jul 31, 2013 at 04:59:55PM -0700, Ben Widawsky wrote:
> > With an upcoming change to bind, to make checkpatch happy and keep the
> > code clean, we need to rework this code a bit.
> > 
> > This should have no functional impact.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c | 13 +++++++------
> >  1 file changed, 7 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index be69807..61ffa71 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -1781,12 +1781,13 @@ i915_drop_caches_set(void *data, u64 val)
> >  
> >  	if (val & DROP_BOUND) {
> >  		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > -					 mm_list)
> > -			if (obj->pin_count == 0) {
> > -				ret = i915_gem_object_unbind(obj);
> > -				if (ret)
> > -					goto unlock;
> > -			}
> > +					 mm_list) {
> > +			if (obj->pin_count)
> > +				continue;
> 
> Give me a newline here, and I'm sold.
> 

Got it. As you may have noticed, this was already fixed in patch 20 of
the series.

> > +			ret = i915_gem_object_unbind(obj);
> > +			if (ret)
> > +				goto unlock;
> > +		}
> >  	}
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 13/29] drm/i915: clear domains for all objects on reset
  2013-08-03 10:59   ` Chris Wilson
@ 2013-08-03 22:24     ` Ben Widawsky
  2013-08-05  9:52       ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-03 22:24 UTC (permalink / raw)
  To: Chris Wilson, Intel GFX

On Sat, Aug 03, 2013 at 11:59:42AM +0100, Chris Wilson wrote:
> On Wed, Jul 31, 2013 at 05:00:06PM -0700, Ben Widawsky wrote:
> > Simply iterating over 1 inactive list is insufficient for the way we now
> > track inactive (1 list per address space). We could alternatively do
> > this with bound + unbound lists, and an inactive check. To me, this way
> > is a bit easier to understand.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index b4c35f0..8ce3545 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2282,7 +2282,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
> >  void i915_gem_reset(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> >  	struct drm_i915_gem_object *obj;
> >  	struct intel_ring_buffer *ring;
> >  	int i;
> > @@ -2293,8 +2293,9 @@ void i915_gem_reset(struct drm_device *dev)
> >  	/* Move everything out of the GPU domains to ensure we do any
> >  	 * necessary invalidation upon reuse.
> >  	 */
> > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> 
> This code is dead. Just remove it rather than port it to vma.
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

Got it, and moved to the front of the series.

commit 8472f08863da69159aa0a7555836ca0511754877
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Sat Aug 3 15:22:17 2013 -0700

    drm/i915: eliminate dead domain clearing on reset
    
    The code itself is no longer accurate without updating it once we have
    multiple address spaces, since clearing the domains of every object
    requires scanning the inactive list for all VMs.
    
    "This code is dead. Just remove it rather than port it to vma." - Chris
    Wilson
    
    Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3a5d4ba..c7e3cee 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2277,12 +2277,6 @@ void i915_gem_reset(struct drm_device *dev)
        for_each_ring(ring, dev_priv, i)
                i915_gem_reset_ring_lists(dev_priv, ring);
 
-       /* Move everything out of the GPU domains to ensure we do any
-        * necessary invalidation upon reuse.
-        */
-       list_for_each_entry(obj, &vm->inactive_list, mm_list)
-               obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
-
        i915_gem_restore_fences(dev);
 }



-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any
  2013-08-03 11:03   ` Chris Wilson
@ 2013-08-03 22:26     ` Ben Widawsky
  0 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-03 22:26 UTC (permalink / raw)
  To: Chris Wilson, Intel GFX

On Sat, Aug 03, 2013 at 12:03:35PM +0100, Chris Wilson wrote:
> On Wed, Jul 31, 2013 at 05:00:12PM -0700, Ben Widawsky wrote:
> > In some places, we want to know if an object is bound in any address
> > space, and not just the global GTT. This often applies when there is a
> > single global resource (object, pages, etc.)
> > 
> > function                             |      reason
> > --------------------------------------------------
> > i915_gem_object_is_inactive          | global object
> > i915_gem_object_put_pages            | object's pages
> > 915_gem_object_unpin                 | global object
> > i915_gem_execbuffer_unreserve_object | temporary until we plumb vma
> > pread/pread                          | object's domain
> 
> Most of these we want to operate on ggtt, so this change looks confused
> at best.
> -Chris
> 

And at worst? I guess I should have explained it a bit differently - yes
the operations may very well be on the global GTT, but for the given
reason (in the table), we need to make sure we unbind (or assert the
object isn't bound) in another address space first.
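
For reference, the bound_any helper is just a walk over the object's vma
list, roughly like this (simplified sketch, not the exact implementation):

	static bool i915_gem_obj_bound_any(struct drm_i915_gem_object *obj)
	{
		struct i915_vma *vma;

		list_for_each_entry(vma, &obj->vma_list, vma_link)
			if (drm_mm_node_allocated(&vma->node))
				return true;

		return false;
	}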

Possibly I've misunderstood what you'd like me to do however.


-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 08/29] drm/i915: Rework __i915_gem_shrink
  2013-08-01  0:00 ` [PATCH 08/29] drm/i915: Rework __i915_gem_shrink Ben Widawsky
@ 2013-08-05  8:59   ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-05  8:59 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:01PM -0700, Ben Widawsky wrote:
> In order to do this for all VMs, it's convenient to rework the logic a
> bit. This should have no functional impact.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

I didn't read ahead but I guess the actual shrink code here will move to
the unbound list like the count code beforehand? Just for symmetry I think
that'd be worth it. Patch applied since it looks neater ;-)
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3aaf875..3ce9d0d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1693,9 +1693,14 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  	}
>  
>  	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> -		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> -		    i915_gem_object_unbind(obj) == 0 &&
> -		    i915_gem_object_put_pages(obj) == 0) {
> +
> +		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
> +			continue;
> +
> +		if (i915_gem_object_unbind(obj))
> +			continue;
> +
> +		if (!i915_gem_object_put_pages(obj)) {
>  			count += obj->base.size >> PAGE_SHIFT;
>  			if (count >= target)
>  				return count;
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 09/29] drm/i915: thread address space through execbuf
  2013-08-01  0:00 ` [PATCH 09/29] drm/i915: thread address space through execbuf Ben Widawsky
@ 2013-08-05  9:39   ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-05  9:39 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:02PM -0700, Ben Widawsky wrote:
> This represents the first half of hooking up VMs to execbuf. Here we
> basically pass an address space all around to the different internal
> functions. It should be much more readable, and have less risk than the
> second half, which begins switching over to using VMAs instead of an
> obj,vm.
> 
> The overall series echoes this style of "add a VM, then make it smart
> later".
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---

[snip]

> @@ -477,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  			    struct list_head *objects,
> +			    struct i915_address_space *vm,
>  			    bool *need_relocs)
>  {
>  	struct drm_i915_gem_object *obj;
> @@ -531,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  		list_for_each_entry(obj, objects, exec_list) {
>  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
>  			bool need_fence, need_mappable;
> +			u32 obj_offset;
>  
> -			if (!i915_gem_obj_ggtt_bound(obj))
> +			if (!i915_gem_obj_bound(obj, vm))
>  				continue;
>  
> +			obj_offset = i915_gem_obj_offset(obj, vm);
>  			need_fence =
>  				has_fenced_gpu_access &&
>  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
>  				obj->tiling_mode != I915_TILING_NONE;
>  			need_mappable = need_fence || need_reloc_mappable(obj);
>  
> +			BUG_ON((need_mappable || need_fence) &&
> +			       !i915_is_ggtt(vm));

No BUG_ON if the error isn't fatal, please. I've fixed this up to a
WARN_ON while applying.
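
I.e. presumably something like this instead (sketch):

	/* warn and keep going instead of taking the machine down */
	WARN_ON((need_mappable || need_fence) && !i915_is_ggtt(vm));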

I know that you prefer BUG_ON for developing, but when we hit one of these
in the wild it's a royal pain to debug. But please either switch to WARNs
when submitting the patches or (imo preferred) get into the habit of
grepping dmesg for backtraces when testing.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 10/29] drm/i915: make caching operate on all address spaces
  2013-08-01  0:00 ` [PATCH 10/29] drm/i915: make caching operate on all address spaces Ben Widawsky
@ 2013-08-05  9:41   ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-05  9:41 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:03PM -0700, Ben Widawsky wrote:
> For now, objects will maintain the same cache levels amongst all address
> spaces. This is to limit the risk of bugs, as playing with cacheability
> in the different domains can be very error prone.
> 
> In the future, it may be optimal to allow setting domains per VMA (ie.
> an object bound into an address space).
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3ce9d0d..adb0a18 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3308,7 +3308,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
> +	struct i915_vma *vma;
>  	int ret;
>  
>  	if (obj->cache_level == cache_level)
> @@ -3319,13 +3319,17 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  		return -EBUSY;
>  	}
>  
> -	if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> -		ret = i915_gem_object_unbind(obj);
> -		if (ret)
> -			return ret;
> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +		if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> +			ret = i915_gem_object_unbind(obj);
> +			if (ret)
> +				return ret;
> +
> +			break;
> +		}
>  	}
>  
> -	if (i915_gem_obj_ggtt_bound(obj)) {
> +	if (i915_gem_obj_bound_any(obj)) {

Hm, I guess this will change later on to a for_each_vma loop?

Patch applied meanwhile.
-Daniel
>  		ret = i915_gem_object_finish_gpu(obj);
>  		if (ret)
>  			return ret;
> @@ -3347,8 +3351,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  		if (obj->has_aliasing_ppgtt_mapping)
>  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
>  					       obj, cache_level);
> -
> -		i915_gem_obj_to_vma(obj, &dev_priv->gtt.base)->node.color = cache_level;
>  	}
>  
>  	if (cache_level == I915_CACHE_NONE) {
> @@ -3374,6 +3376,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  						    old_write_domain);
>  	}
>  
> +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> +		vma->node.color = cache_level;
>  	obj->cache_level = cache_level;
>  	i915_gem_verify_gtt(dev);
>  	return 0;
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 11/29] drm/i915: BUG_ON put_pages later
  2013-08-01  0:00 ` [PATCH 11/29] drm/i915: BUG_ON put_pages later Ben Widawsky
@ 2013-08-05  9:42   ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-05  9:42 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:04PM -0700, Ben Widawsky wrote:
> With multiple VMs, the eviction code benefits from being able to blindly
> put pages without needing to know if there are any entities still
> holding on to those pages. As such it's preferable to return the -EBUSY
> before the BUG.
> 
> Eviction code is the only user for now, but overall it makes sense
> anyway, IMO.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index adb0a18..dbf72d5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1654,11 +1654,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
>  	if (obj->pages == NULL)
>  		return 0;
>  
> -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> -
>  	if (obj->pages_pin_count)
>  		return -EBUSY;
>  
> +	BUG_ON(i915_gem_obj_ggtt_bound(obj));

Hm, shouldn't this be a bound_any eventually? Again I'm too lazy to check
the end result, just noting my thoughts here ;-)
-Daniel

> +
>  	/* ->put_pages might need to allocate memory for the bit17 swizzle
>  	 * array, hence protect them from being reaped by removing them from gtt
>  	 * lists early. */
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 13/29] drm/i915: clear domains for all objects on reset
  2013-08-03 22:24     ` Ben Widawsky
@ 2013-08-05  9:52       ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-05  9:52 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sat, Aug 03, 2013 at 03:24:47PM -0700, Ben Widawsky wrote:
> On Sat, Aug 03, 2013 at 11:59:42AM +0100, Chris Wilson wrote:
> > On Wed, Jul 31, 2013 at 05:00:06PM -0700, Ben Widawsky wrote:
> > > Simply iterating over 1 inactive list is insufficient for the way we now
> > > track inactive (1 list per address space). We could alternatively do
> > > this with bound + unbound lists, and an inactive check. To me, this way
> > > is a bit easier to understand.
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index b4c35f0..8ce3545 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2282,7 +2282,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
> > >  void i915_gem_reset(struct drm_device *dev)
> > >  {
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_address_space *vm;
> > >  	struct drm_i915_gem_object *obj;
> > >  	struct intel_ring_buffer *ring;
> > >  	int i;
> > > @@ -2293,8 +2293,9 @@ void i915_gem_reset(struct drm_device *dev)
> > >  	/* Move everything out of the GPU domains to ensure we do any
> > >  	 * necessary invalidation upon reuse.
> > >  	 */
> > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > 
> > This code is dead. Just remove it rather than port it to vma.
> > -Chris
> > 
> > -- 
> > Chris Wilson, Intel Open Source Technology Centre
> 
> Got it, and moved to the front of the series.
> 
> commit 8472f08863da69159aa0a7555836ca0511754877
> Author: Ben Widawsky <ben@bwidawsk.net>
> Date:   Sat Aug 3 15:22:17 2013 -0700
> 
>     drm/i915: eliminate dead domain clearing on reset
>     
>     The code itself is no longer accurate without updating it once we have
>     multiple address spaces, since clearing the domains of every object
>     requires scanning the inactive list for all VMs.
>     
>     "This code is dead. Just remove it rather than port it to vma." - Chris
>     Wilson
>     
>     Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
>     Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>

That's not a properly formatted patch, so I've stopped merging for now.

/me loves cheap excuses

But overall I _really_ like what the series looks like now, I can dwell in the
cozy feeling that I actually understand what's going on. So if the name of
the game is to keep your maintainer happy I think the goal is unlocked ;-)

Cheers, Daniel

> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3a5d4ba..c7e3cee 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2277,12 +2277,6 @@ void i915_gem_reset(struct drm_device *dev)
>         for_each_ring(ring, dev_priv, i)
>                 i915_gem_reset_ring_lists(dev_priv, ring);
>  
> -       /* Move everything out of the GPU domains to ensure we do any
> -        * necessary invalidation upon reuse.
> -        */
> -       list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -               obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> -
>         i915_gem_restore_fences(dev);
>  }
> 
> 
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH 13/29] drm/i915: eliminate dead domain clearing on reset
  2013-08-01  0:00 ` [PATCH 13/29] drm/i915: clear domains for all objects on reset Ben Widawsky
  2013-08-03 10:59   ` Chris Wilson
@ 2013-08-05 16:46   ` Ben Widawsky
  2013-08-05 17:13     ` Daniel Vetter
  1 sibling, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-05 16:46 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

The code itself is no longer accurate without updating it once we have
multiple address spaces, since clearing the domains of every object
requires scanning the inactive list for all VMs.

"This code is dead. Just remove it rather than port it to vma." - Chris
Wilson

Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d37f5c0..3b9558f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2290,20 +2290,12 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
-	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
 	int i;
 
 	for_each_ring(ring, dev_priv, i)
 		i915_gem_reset_ring_lists(dev_priv, ring);
 
-	/* Move everything out of the GPU domains to ensure we do any
-	 * necessary invalidation upon reuse.
-	 */
-	list_for_each_entry(obj, &vm->inactive_list, mm_list)
-		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
-
 	i915_gem_restore_fences(dev);
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 13/29] drm/i915: eliminate dead domain clearing on reset
  2013-08-05 16:46   ` [PATCH 13/29] drm/i915: eliminate dead domain clearing " Ben Widawsky
@ 2013-08-05 17:13     ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-05 17:13 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Mon, Aug 05, 2013 at 09:46:44AM -0700, Ben Widawsky wrote:
> The code itself is no longer accurate without updating it once we have
> multiple address spaces, since clearing the domains of every object
> requires scanning the inactive list for all VMs.
> 
> "This code is dead. Just remove it rather than port it to vma." - Chris
> Wilson
> 
> Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>

Thanks for the resend, slurped into dinq. I'd like to take a little break here
now and resume merging tomorrow - I've merged over 50 patches just today
;-)
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d37f5c0..3b9558f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2290,20 +2290,12 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> -	struct drm_i915_gem_object *obj;
>  	struct intel_ring_buffer *ring;
>  	int i;
>  
>  	for_each_ring(ring, dev_priv, i)
>  		i915_gem_reset_ring_lists(dev_priv, ring);
>  
> -	/* Move everything out of the GPU domains to ensure we do any
> -	 * necessary invalidation upon reuse.
> -	 */
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> -
>  	i915_gem_restore_fences(dev);
>  }
>  
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 14/29] drm/i915: Restore PDEs on gtt restore
  2013-08-01  0:00 ` [PATCH 14/29] drm/i915: Restore PDEs on gtt restore Ben Widawsky
@ 2013-08-06 18:14   ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 18:14 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:07PM -0700, Ben Widawsky wrote:
> I can't remember why I added this initially.
> 
> TODO: Throw out if not necessary
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Oops. I vaguely remember that there was some shuffling around with where
the ppgtt pdes are written once we have real ppgtt. Since the aliasing
ppgtt isn't a real ppgtt it could be that the write_pdes call in
gen6_ppgtt_enable isn't executed any more and hence we need to do this
manually.

Otoh it looks like we should just call gen6_ppgtt_enable in such a case.
So I can't really help with a reason why we need this. I'm pretty sure
that we don't need it right now, so I'll punt for now. But please keep it
in mind when rebasing the real ppgtt stuff on top, once the VMA patches
are merged to dinq.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 1ed9acb..e9b269f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -470,6 +470,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  				       dev_priv->gtt.base.start / PAGE_SIZE,
>  				       dev_priv->gtt.base.total / PAGE_SIZE);
>  
> +	if (dev_priv->mm.aliasing_ppgtt)
> +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> +
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>  		i915_gem_clflush_object(obj);
>  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 17/29] drm/i915: plumb VM into bind/unbind code
  2013-08-01  0:00 ` [PATCH 17/29] drm/i915: plumb VM into bind/unbind code Ben Widawsky
@ 2013-08-06 18:29   ` Daniel Vetter
  2013-08-06 18:54     ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 18:29 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:10PM -0700, Ben Widawsky wrote:
> As alluded to in several patches, and it will be reiterated later... A
> VMA is an abstraction for a GEM BO bound into an address space.
> Therefore it stands to reason, that the existing bind, and unbind are
> the ones which will be the most impacted. This patch implements this,
> and updates all callers which weren't already updated in the series
> (because it was too messy).
> 
> This patch represents the bulk of an earlier, larger patch. I've pulled
> out a bunch of things by the request of Daniel. The history is preserved
> for posterity with the email convention of ">" One big change from the
> original patch aside from a bunch of cropping is I've created an
> i915_vma_unbind() function. That is because we always have the VMA
> anyway, so an extra lookup is unnecessary. There is a caveat: we
> retain an i915_gem_object_ggtt_unbind, for the global cases which might
> not talk in VMAs.
> 
> > drm/i915: plumb VM into object operations
> >
> > This patch was formerly known as:
> > "drm/i915: Create VMAs (part 3) - plumbing"
> >
> > This patch adds a VM argument, bind/unbind, and the object
> > offset/size/color getters/setters. It preserves the old ggtt helper
> > functions because things still need, and will continue to need them.
> >
> > Some code will still need to be ported over after this.
> >
> > v2: Fix purge to pick an object and unbind all vmas
> > This was doable because of the global bound list change.
> >
> > v3: With the commit to actually pin/unpin pages in place, there is no
> > longer a need to check if unbind succeeded before calling put_pages().
> > Make put_pages only BUG() after checking pin count.
> >
> > v4: Rebased on top of the new hangcheck work by Mika
> > plumbed eb_destroy also
> > Many checkpatch related fixes
> >
> > v5: Very large rebase
> >
> > v6:
> > Change BUG_ON to WARN_ON (Daniel)
> > Rename vm to ggtt in preallocate stolen, since it is always ggtt when
> > dealing with stolen memory. (Daniel)
> > list_for_each will short-circuit already (Daniel)
> > remove superfluous space (Daniel)
> > Use per object list of vmas (Daniel)
> > Make obj_bound_any() use obj_bound for each vm (Ben)
> > s/bind_to_gtt/bind_to_vm/ (Ben)
> >
> > Fixed up the inactive shrinker. As Daniel noticed the code could
> > potentially count the same object multiple times. While it's not
> > possible in the current case, since 1 object can only ever be bound into
> > 1 address space thus far - we may as well try to get something more
> > future proof in place now. With a prep patch before this to switch over
> > to using the bound list + inactive check, we're now able to carry that
> > forward for every address space an object is bound into.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

A bunch of comments below. Overall that patch is imo really easy to review
and the comments can all be addressed in follow-up patches (if at all).
But I think I've spotted a leak in object_pin.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |   2 +-
>  drivers/gpu/drm/i915/i915_drv.h            |   3 +-
>  drivers/gpu/drm/i915/i915_gem.c            | 134 +++++++++++++++++++----------
>  drivers/gpu/drm/i915/i915_gem_evict.c      |   4 +-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +-
>  drivers/gpu/drm/i915/i915_gem_tiling.c     |   9 +-
>  drivers/gpu/drm/i915/i915_trace.h          |  37 ++++----
>  7 files changed, 120 insertions(+), 71 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index d6154cb..6d5ca85bd 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1796,7 +1796,7 @@ i915_drop_caches_set(void *data, u64 val)
>  					 mm_list) {
>  			if (obj->pin_count)
>  				continue;
> -			ret = i915_gem_object_unbind(obj);
> +			ret = i915_gem_object_ggtt_unbind(obj);
>  			if (ret)
>  				goto unlock;
>  		}
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index dbfffb2..0610588 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1700,7 +1700,8 @@ int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  				     bool map_and_fenceable,
>  				     bool nonblocking);
>  void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
> -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
> +int __must_check i915_vma_unbind(struct i915_vma *vma);
> +int __must_check i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj);
>  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
>  void i915_gem_lastclose(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 4b669e8..0cb36c2 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -38,10 +38,12 @@
>  
>  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
>  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> -						    unsigned alignment,
> -						    bool map_and_fenceable,
> -						    bool nonblocking);
> +static __must_check int
> +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> +			   struct i915_address_space *vm,
> +			   unsigned alignment,
> +			   bool map_and_fenceable,
> +			   bool nonblocking);
>  static int i915_gem_phys_pwrite(struct drm_device *dev,
>  				struct drm_i915_gem_object *obj,
>  				struct drm_i915_gem_pwrite *args,
> @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  		  bool purgeable_only)
>  {
>  	struct drm_i915_gem_object *obj, *next;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	long count = 0;
>  
>  	list_for_each_entry_safe(obj, next,
> @@ -1692,13 +1693,16 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  		}
>  	}
>  
> -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> +	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> +				 global_list) {
> +		struct i915_vma *vma, *v;
>  
>  		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
>  			continue;
>  
> -		if (i915_gem_object_unbind(obj))
> -			continue;
> +		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
> +			if (i915_vma_unbind(vma))
> +				break;
>  
>  		if (!i915_gem_object_put_pages(obj)) {
>  			count += obj->base.size >> PAGE_SHIFT;
> @@ -2591,17 +2595,13 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
>  					    old_write_domain);
>  }
>  
> -/**
> - * Unbinds an object from the GTT aperture.
> - */
> -int
> -i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> +int i915_vma_unbind(struct i915_vma *vma)
>  {
> +	struct drm_i915_gem_object *obj = vma->obj;
>  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
> -	struct i915_vma *vma;
>  	int ret;
>  
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (list_empty(&vma->vma_link))
>  		return 0;

This smells like something which should never be the case. Add a WARN_ON
in a follow-up patch?
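
Something like this, perhaps (untested sketch of the follow-up):

	if (WARN_ON(list_empty(&vma->vma_link)))
		return 0;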

>  
>  	if (obj->pin_count)
> @@ -2624,7 +2624,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	if (ret)
>  		return ret;
>  
> -	trace_i915_gem_object_unbind(obj);
> +	trace_i915_vma_unbind(vma);
>  
>  	if (obj->has_global_gtt_mapping)
>  		i915_gem_gtt_unbind_object(obj);
> @@ -2639,7 +2639,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	/* Avoid an unnecessary call to unbind on rebind. */
>  	obj->map_and_fenceable = true;
>  
> -	vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
>  	i915_gem_vma_destroy(vma);
>  
>  	/* Since the unbound list is global, only move to that list if
> @@ -2652,6 +2651,26 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	return 0;
>  }
>  
> +/**
> + * Unbinds an object from the global GTT aperture.
> + */
> +int
> +i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj)
> +{
> +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> +	struct i915_address_space *ggtt = &dev_priv->gtt.base;
> +
> +	if (!i915_gem_obj_ggtt_bound(obj))
> +		return 0;
> +
> +	if (obj->pin_count)
> +		return -EBUSY;
> +
> +	BUG_ON(obj->pages == NULL);
> +
> +	return i915_vma_unbind(i915_gem_obj_to_vma(obj, ggtt));
> +}
> +
>  int i915_gpu_idle(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> @@ -3069,18 +3088,18 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
>   * Finds free space in the GTT aperture and binds the object there.
>   */
>  static int
> -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> -			    unsigned alignment,
> -			    bool map_and_fenceable,
> -			    bool nonblocking)
> +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> +			   struct i915_address_space *vm,
> +			   unsigned alignment,
> +			   bool map_and_fenceable,
> +			   bool nonblocking)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	u32 size, fence_size, fence_alignment, unfenced_alignment;
>  	bool mappable, fenceable;
> -	size_t gtt_max = map_and_fenceable ?
> -		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> +	size_t gtt_max =
> +		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
>  	struct i915_vma *vma;
>  	int ret;
>  
> @@ -3125,15 +3144,18 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  
>  	i915_gem_object_pin_pages(obj);
>  
> -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> +	/* FIXME: For now we only ever use 1 VMA per object */
> +	BUG_ON(!i915_is_ggtt(vm));

I'll change this to a WARN_ON for now ... My upfront apologies for the
rebase conflict.

> +	WARN_ON(!list_empty(&obj->vma_list));
> +
> +	vma = i915_gem_vma_create(obj, vm);
>  	if (IS_ERR(vma)) {
>  		i915_gem_object_unpin_pages(obj);
>  		return PTR_ERR(vma);
>  	}
>  
>  search_free:
> -	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> -						  &vma->node,
> +	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
>  						  size, alignment,
>  						  obj->cache_level, 0, gtt_max);
>  	if (ret) {
> @@ -3158,18 +3180,25 @@ search_free:
>  
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
>  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> -	list_add(&vma->vma_link, &obj->vma_list);
> +
> +	/* Keep GGTT vmas first to make debug easier */
> +	if (i915_is_ggtt(vm))
> +		list_add(&vma->vma_link, &obj->vma_list);
> +	else
> +		list_add_tail(&vma->vma_link, &obj->vma_list);
>  
>  	fenceable =
> +		i915_is_ggtt(vm) &&
>  		i915_gem_obj_ggtt_size(obj) == fence_size &&
>  		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
>  
> -	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
> -		dev_priv->gtt.mappable_end;
> +	mappable =
> +		i915_is_ggtt(vm) &&
> +		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
>  
>  	obj->map_and_fenceable = mappable && fenceable;
>  
> -	trace_i915_gem_object_bind(obj, map_and_fenceable);
> +	trace_i915_vma_bind(vma, map_and_fenceable);
>  	i915_gem_verify_gtt(dev);
>  	return 0;
>  
> @@ -3335,7 +3364,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  
>  	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>  		if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> -			ret = i915_gem_object_unbind(obj);
> +			ret = i915_vma_unbind(vma);
>  			if (ret)
>  				return ret;
>  
> @@ -3643,33 +3672,39 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  		    bool map_and_fenceable,
>  		    bool nonblocking)
>  {
> +	struct i915_vma *vma;
>  	int ret;
>  
>  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
>  		return -EBUSY;
>  
> -	if (i915_gem_obj_ggtt_bound(obj)) {
> -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> +	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
> +
> +	vma = i915_gem_obj_to_vma(obj, vm);
> +
> +	if (vma) {
> +		if ((alignment &&
> +		     vma->node.start & (alignment - 1)) ||
>  		    (map_and_fenceable && !obj->map_and_fenceable)) {
>  			WARN(obj->pin_count,
>  			     "bo is already pinned with incorrect alignment:"
>  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
>  			     " obj->map_and_fenceable=%d\n",
> -			     i915_gem_obj_ggtt_offset(obj), alignment,
> +			     i915_gem_obj_offset(obj, vm), alignment,
>  			     map_and_fenceable,
>  			     obj->map_and_fenceable);
> -			ret = i915_gem_object_unbind(obj);
> +			ret = i915_vma_unbind(vma);

If I read this correctly then we won't call i915_gem_vma_destroy anymore
and so will leak the vma. Is that correct? If so I guess a new slab for
vmas could be handy to easily detect such bugs.
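
Fwiw the slab would just be the usual dance, sketched here with a made-up
dev_priv->vmas cache:

	/* at driver load, e.g. in i915_gem_load() */
	dev_priv->vmas = kmem_cache_create("i915_gem_vma",
					   sizeof(struct i915_vma), 0,
					   SLAB_HWCACHE_ALIGN, NULL);

	/* i915_gem_vma_create() then allocates with */
	vma = kmem_cache_zalloc(dev_priv->vmas, GFP_KERNEL);

	/* and i915_gem_vma_destroy() frees with */
	kmem_cache_free(dev_priv->vmas, vma);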

>  			if (ret)
>  				return ret;
>  		}
>  	}
>  
> -	if (!i915_gem_obj_ggtt_bound(obj)) {
> +	if (!i915_gem_obj_bound(obj, vm)) {
>  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  
> -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> -						  map_and_fenceable,
> -						  nonblocking);
> +		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
> +						 map_and_fenceable,
> +						 nonblocking);
>  		if (ret)
>  			return ret;
>  
> @@ -3961,6 +3996,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> +	struct i915_vma *vma, *next;
>  
>  	trace_i915_gem_object_destroy(obj);
>  
> @@ -3968,15 +4004,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  		i915_gem_detach_phys_object(dev, obj);
>  
>  	obj->pin_count = 0;
> -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> -		bool was_interruptible;
> +	/* NB: 0 or 1 elements */
> +	WARN_ON(!list_empty(&obj->vma_list) &&
> +		!list_is_singular(&obj->vma_list));
> +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> +		int ret = i915_vma_unbind(vma);
> +		if (WARN_ON(ret == -ERESTARTSYS)) {
> +			bool was_interruptible;
>  
> -		was_interruptible = dev_priv->mm.interruptible;
> -		dev_priv->mm.interruptible = false;
> +			was_interruptible = dev_priv->mm.interruptible;
> +			dev_priv->mm.interruptible = false;
>  
> -		WARN_ON(i915_gem_object_unbind(obj));
> +			WARN_ON(i915_vma_unbind(vma));
>  
> -		dev_priv->mm.interruptible = was_interruptible;
> +			dev_priv->mm.interruptible = was_interruptible;
> +		}

Hm, I think we've shipped enough kernel releases without ever hitting the
above WARN_ON. Can I volunteer you for a follow-up patch to rip out the
code here (but keep the WARN_ON ofc)?
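
I.e. the loop above would collapse to roughly (untested):

	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link)
		WARN_ON(i915_vma_unbind(vma));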

>  	}
>  
>  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 33d85a4..9205a41 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -147,7 +147,7 @@ found:
>  				       struct drm_i915_gem_object,
>  				       exec_list);
>  		if (ret == 0)
> -			ret = i915_gem_object_unbind(obj);
> +			ret = i915_gem_object_ggtt_unbind(obj);
>  
>  		list_del_init(&obj->exec_list);
>  		drm_gem_object_unreference(&obj->base);
> @@ -185,7 +185,7 @@ i915_gem_evict_everything(struct drm_device *dev)
>  	/* Having flushed everything, unbind() should never raise an error */
>  	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
>  		if (obj->pin_count == 0)
> -			WARN_ON(i915_gem_object_unbind(obj));
> +			WARN_ON(i915_gem_object_ggtt_unbind(obj));
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index a23b80f..5e68f1e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -556,7 +556,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  			if ((entry->alignment &&
>  			     obj_offset & (entry->alignment - 1)) ||
>  			    (need_mappable && !obj->map_and_fenceable))
> -				ret = i915_gem_object_unbind(obj);
> +				ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));
>  			else
>  				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
>  			if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 92a8d27..032e9ef 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -360,17 +360,18 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>  
>  		obj->map_and_fenceable =
>  			!i915_gem_obj_ggtt_bound(obj) ||
> -			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> +			(i915_gem_obj_ggtt_offset(obj) +
> +			 obj->base.size <= dev_priv->gtt.mappable_end &&
>  			 i915_gem_object_fence_ok(obj, args->tiling_mode));
>  
>  		/* Rebind if we need a change of alignment */
>  		if (!obj->map_and_fenceable) {
> -			u32 unfenced_alignment =
> +			u32 unfenced_align =
>  				i915_gem_get_gtt_alignment(dev, obj->base.size,
>  							    args->tiling_mode,
>  							    false);
> -			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> -				ret = i915_gem_object_unbind(obj);
> +			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> +				ret = i915_gem_object_ggtt_unbind(obj);
>  		}
>  
>  		if (ret == 0) {
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 7d283b5..931e2c6 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -33,47 +33,52 @@ TRACE_EVENT(i915_gem_object_create,
>  	    TP_printk("obj=%p, size=%u", __entry->obj, __entry->size)
>  );
>  
> -TRACE_EVENT(i915_gem_object_bind,
> -	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> -	    TP_ARGS(obj, mappable),
> +TRACE_EVENT(i915_vma_bind,
> +	    TP_PROTO(struct i915_vma *vma, bool mappable),
> +	    TP_ARGS(vma, mappable),
>  
>  	    TP_STRUCT__entry(
>  			     __field(struct drm_i915_gem_object *, obj)
> +			     __field(struct i915_address_space *, vm)
>  			     __field(u32, offset)
>  			     __field(u32, size)
>  			     __field(bool, mappable)
>  			     ),
>  
>  	    TP_fast_assign(
> -			   __entry->obj = obj;
> -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> +			   __entry->obj = vma->obj;
> +			   __entry->vm = vma->vm;
> +			   __entry->offset = vma->node.start;
> +			   __entry->size = vma->node.size;
>  			   __entry->mappable = mappable;
>  			   ),
>  
> -	    TP_printk("obj=%p, offset=%08x size=%x%s",
> +	    TP_printk("obj=%p, offset=%08x size=%x%s vm=%p",
>  		      __entry->obj, __entry->offset, __entry->size,
> -		      __entry->mappable ? ", mappable" : "")
> +		      __entry->mappable ? ", mappable" : "",
> +		      __entry->vm)
>  );
>  
> -TRACE_EVENT(i915_gem_object_unbind,
> -	    TP_PROTO(struct drm_i915_gem_object *obj),
> -	    TP_ARGS(obj),
> +TRACE_EVENT(i915_vma_unbind,
> +	    TP_PROTO(struct i915_vma *vma),
> +	    TP_ARGS(vma),
>  
>  	    TP_STRUCT__entry(
>  			     __field(struct drm_i915_gem_object *, obj)
> +			     __field(struct i915_address_space *, vm)
>  			     __field(u32, offset)
>  			     __field(u32, size)
>  			     ),
>  
>  	    TP_fast_assign(
> -			   __entry->obj = obj;
> -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> +			   __entry->obj = vma->obj;
> +			   __entry->vm = vma->vm;
> +			   __entry->offset = vma->node.start;
> +			   __entry->size = vma->node.size;
>  			   ),
>  
> -	    TP_printk("obj=%p, offset=%08x size=%x",
> -		      __entry->obj, __entry->offset, __entry->size)
> +	    TP_printk("obj=%p, offset=%08x size=%x vm=%p",
> +		      __entry->obj, __entry->offset, __entry->size, __entry->vm)
>  );
>  
>  TRACE_EVENT(i915_gem_object_change_domain,
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-08-01  0:00 ` [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code Ben Widawsky
@ 2013-08-06 18:39   ` Daniel Vetter
  2013-08-06 21:27     ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 18:39 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:11PM -0700, Ben Widawsky wrote:
> Eviction code, like the rest of the converted code, needs to be aware of
> the address space for which it is evicting (or the everything case, all
> addresses). With the updated bind/unbind interfaces of the last patch,
> we can now safely move the eviction code over.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Two comments below.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h       |  4 ++-
>  drivers/gpu/drm/i915/i915_gem.c       |  2 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c | 53 +++++++++++++++++++----------------
>  3 files changed, 33 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0610588..bf1ecef 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1946,7 +1946,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
>  
>  
>  /* i915_gem_evict.c */
> -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> +int __must_check i915_gem_evict_something(struct drm_device *dev,
> +					  struct i915_address_space *vm,
> +					  int min_size,
>  					  unsigned alignment,
>  					  unsigned cache_level,
>  					  bool mappable,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0cb36c2..1013105 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3159,7 +3159,7 @@ search_free:
>  						  size, alignment,
>  						  obj->cache_level, 0, gtt_max);
>  	if (ret) {
> -		ret = i915_gem_evict_something(dev, size, alignment,
> +		ret = i915_gem_evict_something(dev, vm, size, alignment,
>  					       obj->cache_level,
>  					       map_and_fenceable,
>  					       nonblocking);
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 9205a41..61bf5e2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -32,26 +32,21 @@
>  #include "i915_trace.h"
>  
>  static bool
> -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> +mark_free(struct i915_vma *vma, struct list_head *unwind)
>  {
> -	struct drm_device *dev = obj->base.dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
> -
> -	if (obj->pin_count)
> +	if (vma->obj->pin_count)
>  		return false;
>  
> -	list_add(&obj->exec_list, unwind);
> +	list_add(&vma->obj->exec_list, unwind);
>  	return drm_mm_scan_add_block(&vma->node);
>  }
>  
>  int
> -i915_gem_evict_something(struct drm_device *dev, int min_size,
> -			 unsigned alignment, unsigned cache_level,
> +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> +			 int min_size, unsigned alignment, unsigned cache_level,
>  			 bool mappable, bool nonblocking)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	struct list_head eviction_list, unwind_list;
>  	struct i915_vma *vma;
>  	struct drm_i915_gem_object *obj;
> @@ -83,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>  	 */
>  
>  	INIT_LIST_HEAD(&unwind_list);
> -	if (mappable)
> +	if (mappable) {
> +		BUG_ON(!i915_is_ggtt(vm));
>  		drm_mm_init_scan_with_range(&vm->mm, min_size,
>  					    alignment, cache_level, 0,
>  					    dev_priv->gtt.mappable_end);
> -	else
> +	} else
>  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
>  
>  	/* First see if there is a large enough contiguous idle region... */
>  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> -		if (mark_free(obj, &unwind_list))
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
>  
> @@ -101,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>  
>  	/* Now merge in the soon-to-be-expired objects... */
>  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		if (mark_free(obj, &unwind_list))
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
>  
> @@ -111,7 +109,7 @@ none:
>  		obj = list_first_entry(&unwind_list,
>  				       struct drm_i915_gem_object,
>  				       exec_list);
> -		vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
> +		vma = i915_gem_obj_to_vma(obj, vm);
>  		ret = drm_mm_scan_remove_block(&vma->node);
>  		BUG_ON(ret);
>  
> @@ -132,7 +130,7 @@ found:
>  		obj = list_first_entry(&unwind_list,
>  				       struct drm_i915_gem_object,
>  				       exec_list);
> -		vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
> +		vma = i915_gem_obj_to_vma(obj, vm);
>  		if (drm_mm_scan_remove_block(&vma->node)) {
>  			list_move(&obj->exec_list, &eviction_list);
>  			drm_gem_object_reference(&obj->base);
> @@ -147,7 +145,7 @@ found:
>  				       struct drm_i915_gem_object,
>  				       exec_list);
>  		if (ret == 0)
> -			ret = i915_gem_object_ggtt_unbind(obj);
> +			ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));

Again I think the ggtt_unbind->vma_unbind conversion seems to leak the
vma. It feels like vma_unbind should call vma_destroy?

>  
>  		list_del_init(&obj->exec_list);
>  		drm_gem_object_unreference(&obj->base);
> @@ -160,13 +158,18 @@ int
>  i915_gem_evict_everything(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
>  	struct drm_i915_gem_object *obj, *next;
> -	bool lists_empty;
> +	bool lists_empty = true;
>  	int ret;
>  
> -	lists_empty = (list_empty(&vm->inactive_list) &&
> -		       list_empty(&vm->active_list));
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		lists_empty = (list_empty(&vm->inactive_list) &&
> +			       list_empty(&vm->active_list));
> +		if (!lists_empty)
> +			break;
> +	}
> +
>  	if (lists_empty)
>  		return -ENOSPC;
>  
> @@ -183,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
>  	i915_gem_retire_requests(dev);
>  
>  	/* Having flushed everything, unbind() should never raise an error */
> -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> -		if (obj->pin_count == 0)
> -			WARN_ON(i915_gem_object_ggtt_unbind(obj));
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> +			if (obj->pin_count == 0)
> +				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
> +	}

The conversion of evict_everything looks a bit strange. Essentially we
have three callers:
- ums+gem support code in leavevt to rid the gtt of all gem objects when
  the userspace X ums ddx stops controlling the hw.
- When we've seriously run out of memory, in shrink_all.
- In execbuf when we've fragmented the gtt address space so badly that we
  need to start over completely fresh.

With this it imo would make sense to just loop over the global bound
object list. But for the execbuf caller, adding a vm parameter (and only
evicting from that special vm, skipping all others) would make sense.
Other callers would pass NULL since they want everything to get evicted.
Volunteered for that follow-up?
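
Roughly what I have in mind, with a hypothetical only_vm parameter where
NULL means "evict from every vm" (untested):

	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
		if (only_vm && vm != only_vm)
			continue;

		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
			if (obj->pin_count == 0)
				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
	}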

>  
>  	return 0;
>  }
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any
  2013-08-01  0:00 ` [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any Ben Widawsky
  2013-08-03 11:03   ` Chris Wilson
@ 2013-08-06 18:43   ` Daniel Vetter
  2013-08-06 21:29     ` Ben Widawsky
  1 sibling, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 18:43 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:12PM -0700, Ben Widawsky wrote:
> In some places, we want to know if an object is bound in any address
> space, and not just the global GTT. This often applies when there is a
> single global resource (object, pages, etc.)
> 
> function                             |      reason
> --------------------------------------------------
> i915_gem_object_is_inactive          | global object
> i915_gem_object_put_pages            | object's pages
> i915_gem_object_unpin                | global object
> i915_gem_execbuffer_unreserve_object | temporary until we plumb vma
> pread/pwrite                         | object's domain

pread/pwrite isn't about the object's domain at all, but purely about
synchronizing for outstanding rendering. Replacing the call to
set_to_gtt_domain with a wait_rendering would imo improve code
readability. Furthermore we could pimp pread to only block for outstanding
writes and not for reads.

Since you're not the first one to trip over this: Can I volunteer you for
a follow-up patch to fix this?
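
For the pread side I'm thinking of something like this, assuming
i915_gem_object_wait_rendering (currently static) is made available at
that point in i915_gem.c (untested):

		if (obj->cache_level == I915_CACHE_NONE)
			needs_clflush = 1;
		/* only stall for outstanding GPU writes, never for reads */
		ret = i915_gem_object_wait_rendering(obj, true);
		if (ret)
			return ret;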

Otherwise patch looks good.
-Daniel

> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem.c            | 12 ++++++------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1013105..d4d6444 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -122,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  static inline bool
>  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
>  {
> -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> +	return i915_gem_obj_bound_any(obj) && !obj->active;
>  }
>  
>  int
> @@ -408,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
>  		 * anyway again before the next pread happens. */
>  		if (obj->cache_level == I915_CACHE_NONE)
>  			needs_clflush = 1;
> -		if (i915_gem_obj_ggtt_bound(obj)) {
> +		if (i915_gem_obj_bound_any(obj)) {
>  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
>  			if (ret)
>  				return ret;
> @@ -725,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>  		 * right away and we therefore have to clflush anyway. */
>  		if (obj->cache_level == I915_CACHE_NONE)
>  			needs_clflush_after = 1;
> -		if (i915_gem_obj_ggtt_bound(obj)) {
> +		if (i915_gem_obj_bound_any(obj)) {
>  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
>  			if (ret)
>  				return ret;
> @@ -1659,7 +1659,7 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
>  	if (obj->pages_pin_count)
>  		return -EBUSY;
>  
> -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> +	BUG_ON(i915_gem_obj_bound_any(obj));
>  
>  	/* ->put_pages might need to allocate memory for the bit17 swizzle
>  	 * array, hence protect them from being reaped by removing them from gtt
> @@ -3301,7 +3301,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  	int ret;
>  
>  	/* Not valid to be called on unbound objects. */
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (!i915_gem_obj_bound_any(obj))
>  		return -EINVAL;
>  
>  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> @@ -3725,7 +3725,7 @@ void
>  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
>  {
>  	BUG_ON(obj->pin_count == 0);
> -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> +	BUG_ON(!i915_gem_obj_bound_any(obj));
>  
>  	if (--obj->pin_count == 0)
>  		obj->pin_mappable = false;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 5e68f1e..64dc6b5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -466,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_i915_gem_exec_object2 *entry;
>  
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (!i915_gem_obj_bound_any(obj))
>  		return;
>  
>  	entry = obj->exec_entry;
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 17/29] drm/i915: plumb VM into bind/unbind code
  2013-08-06 18:29   ` Daniel Vetter
@ 2013-08-06 18:54     ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 18:54 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 08:29:47PM +0200, Daniel Vetter wrote:
> On Wed, Jul 31, 2013 at 05:00:10PM -0700, Ben Widawsky wrote:

[snip]

> > @@ -3643,33 +3672,39 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  		    bool map_and_fenceable,
> >  		    bool nonblocking)
> >  {
> > +	struct i915_vma *vma;
> >  	int ret;
> >  
> >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> >  		return -EBUSY;
> >  
> > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > +	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > +
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> > +
> > +	if (vma) {
> > +		if ((alignment &&
> > +		     vma->node.start & (alignment - 1)) ||
> >  		    (map_and_fenceable && !obj->map_and_fenceable)) {
> >  			WARN(obj->pin_count,
> >  			     "bo is already pinned with incorrect alignment:"
> >  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
> >  			     " obj->map_and_fenceable=%d\n",
> > -			     i915_gem_obj_ggtt_offset(obj), alignment,
> > +			     i915_gem_obj_offset(obj, vm), alignment,
> >  			     map_and_fenceable,
> >  			     obj->map_and_fenceable);
> > -			ret = i915_gem_object_unbind(obj);
> > +			ret = i915_vma_unbind(vma);
> 
> If I read this correctly then we wont' call i915_gem_vma_destroy anymore
> and so will leak the vma. Is that correct? If so I guess a new slab for
> vmas could be handy to easily detect such bugs.

On re-reading all seems to be fine here since object_unbind was converted
to vma_unbind and so inherited the call to vma_destroy. So no leak here.
The other stuff isn't really critical, so I'll merge this patch (and the
next one).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 20/29] drm/i915: Fix up map and fenceable for VMA
  2013-08-01  0:00 ` [PATCH 20/29] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
@ 2013-08-06 19:11   ` Daniel Vetter
  2013-08-07 18:37     ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 19:11 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:13PM -0700, Ben Widawsky wrote:
> formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
> tracking"
> 
> The map_and_fenceable tracking is per object. GTT mapping and fences
> only apply to the global GTT. As such, object operations which are not
> performed on the global GTT should not affect mappable or fenceable
> characteristics.
> 
> Functionally, this commit could very well be squashed in to a previous
> patch which updated object operations to take a VM argument.  This
> commit is split out because it's a bit tricky (or at least it was for
> me).
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d4d6444..ec23a5c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2626,7 +2626,7 @@ int i915_vma_unbind(struct i915_vma *vma)
>  
>  	trace_i915_vma_unbind(vma);
>  
> -	if (obj->has_global_gtt_mapping)
> +	if (obj->has_global_gtt_mapping && i915_is_ggtt(vma->vm))
>  		i915_gem_gtt_unbind_object(obj);
>  	if (obj->has_aliasing_ppgtt_mapping) {
>  		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);

Hm, shouldn't we do the is_ggtt check for both? After all, only the global
ggtt can ever be aliased ... This would also be more symmetric with some
of the other global gtt checks I've spotted. Your take, or will that run
afoul of your Great Plan?
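
I.e. roughly (untested):

	if (i915_is_ggtt(vma->vm)) {
		if (obj->has_global_gtt_mapping)
			i915_gem_gtt_unbind_object(obj);
		if (obj->has_aliasing_ppgtt_mapping) {
			i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
			obj->has_aliasing_ppgtt_mapping = 0;
		}
	}
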
-Daniel

> @@ -2637,7 +2637,8 @@ int i915_vma_unbind(struct i915_vma *vma)
>  
>  	list_del(&obj->mm_list);
>  	/* Avoid an unnecessary call to unbind on rebind. */
> -	obj->map_and_fenceable = true;
> +	if (i915_is_ggtt(vma->vm))
> +		obj->map_and_fenceable = true;
>  
>  	i915_gem_vma_destroy(vma);
>  
> @@ -3196,7 +3197,9 @@ search_free:
>  		i915_is_ggtt(vm) &&
>  		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
>  
> -	obj->map_and_fenceable = mappable && fenceable;
> +	/* Map and fenceable only changes if the VM is the global GGTT */
> +	if (i915_is_ggtt(vm))
> +		obj->map_and_fenceable = mappable && fenceable;
>  
>  	trace_i915_vma_bind(vma, map_and_fenceable);
>  	i915_gem_verify_gtt(dev);
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 21/29] drm/i915: mm_list is per VMA
  2013-08-01  0:00 ` [PATCH 21/29] drm/i915: mm_list is per VMA Ben Widawsky
@ 2013-08-06 19:38   ` Daniel Vetter
  2013-08-07  0:28     ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 19:38 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:14PM -0700, Ben Widawsky wrote:
> formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
> 
> The mm_list is used for the active/inactive LRUs. Since those LRUs are
> per address space, the link should be per VMA.
> 
> Because we'll only ever have 1 VMA before this point, it's not incorrect
> to defer this change until this point in the patch series, and doing it
> here makes the change much easier to understand.
> 
> Shamelessly manipulated out of Daniel:
> "active/inactive stuff is used by eviction when we run out of address
> space, so needs to be per-vma and per-address space. Bound/unbound otoh
> is used by the shrinker which only cares about the amount of memory used
> and not one bit about in which address space this memory is all used in.
> Of course to actually kick out an object we need to unbind it from every
> address space, but for that we have the per-object list of vmas."
> 
> v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
> 
> v3: Moved earlier in the series
> 
> v4: Add dropped message from v3
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Some comments below for this one. The lru changes look a bit strange so
I'll wait for your confirmation that the do_switch hunk has the same
reasons as the one in execbuf with the FIXME comment.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        | 53 ++++++++++++++++++++----------
>  drivers/gpu/drm/i915/i915_drv.h            |  5 +--
>  drivers/gpu/drm/i915/i915_gem.c            | 37 +++++++++++----------
>  drivers/gpu/drm/i915/i915_gem_context.c    |  3 ++
>  drivers/gpu/drm/i915/i915_gem_evict.c      | 14 ++++----
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 ++
>  drivers/gpu/drm/i915/i915_gem_stolen.c     |  2 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c      | 37 ++++++++++++---------
>  8 files changed, 91 insertions(+), 62 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 6d5ca85bd..181e5a6 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -149,7 +149,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	size_t total_obj_size, total_gtt_size;
>  	int count, ret;
>  
> @@ -157,6 +157,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	if (ret)
>  		return ret;
>  
> +	/* FIXME: the user of this interface might want more than just GGTT */
>  	switch (list) {
>  	case ACTIVE_LIST:
>  		seq_puts(m, "Active:\n");
> @@ -172,12 +173,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	}
>  
>  	total_obj_size = total_gtt_size = count = 0;
> -	list_for_each_entry(obj, head, mm_list) {
> -		seq_puts(m, "   ");
> -		describe_obj(m, obj);
> -		seq_putc(m, '\n');
> -		total_obj_size += obj->base.size;
> -		total_gtt_size += i915_gem_obj_ggtt_size(obj);
> +	list_for_each_entry(vma, head, mm_list) {
> +		seq_printf(m, "   ");
> +		describe_obj(m, vma->obj);
> +		seq_printf(m, "\n");
> +		total_obj_size += vma->obj->base.size;
> +		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);

Why not use vma->node.size? If you don't disagree I'll bikeshed this while
applying.
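
I.e. simply:

	total_gtt_size += vma->node.size;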

>  		count++;
>  	}
>  	mutex_unlock(&dev->struct_mutex);
> @@ -224,7 +225,18 @@ static int per_file_stats(int id, void *ptr, void *data)
>  	return 0;
>  }
>  
> -static int i915_gem_object_info(struct seq_file *m, void *data)
> +#define count_vmas(list, member) do { \
> +	list_for_each_entry(vma, list, member) { \
> +		size += i915_gem_obj_ggtt_size(vma->obj); \
> +		++count; \
> +		if (vma->obj->map_and_fenceable) { \
> +			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
> +			++mappable_count; \
> +		} \
> +	} \
> +} while (0)
> +
> +static int i915_gem_object_info(struct seq_file *m, void* data)
>  {
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
> @@ -234,6 +246,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
>  	struct drm_i915_gem_object *obj;
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	struct drm_file *file;
> +	struct i915_vma *vma;
>  	int ret;
>  
>  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> @@ -253,12 +266,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
>  		   count, mappable_count, size, mappable_size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> -	count_objects(&vm->active_list, mm_list);
> +	count_vmas(&vm->active_list, mm_list);
>  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> -	count_objects(&vm->inactive_list, mm_list);
> +	count_vmas(&vm->inactive_list, mm_list);
>  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
> @@ -1771,7 +1784,8 @@ i915_drop_caches_set(void *data, u64 val)
>  	struct drm_device *dev = data;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_object *obj, *next;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma, *x;
>  	int ret;
>  
>  	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
> @@ -1792,13 +1806,16 @@ i915_drop_caches_set(void *data, u64 val)
>  		i915_gem_retire_requests(dev);
>  
>  	if (val & DROP_BOUND) {
> -		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> -					 mm_list) {
> -			if (obj->pin_count)
> -				continue;
> -			ret = i915_gem_object_ggtt_unbind(obj);
> -			if (ret)
> -				goto unlock;
> +		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +			list_for_each_entry_safe(vma, x, &vm->inactive_list,
> +						 mm_list) {

Imo the double-loop is a bit funny, looping over the global bound list
and skipping all active objects would be the more straightforward logic. But
I agree that this is the more straightforward conversion, so I'm ok with a
follow-up fixup patch.
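
For reference, the single loop over the bound list would look roughly like
this (untested):

	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
				 global_list) {
		struct i915_vma *vma, *v;

		if (obj->pin_count || obj->active)
			continue;

		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) {
			ret = i915_vma_unbind(vma);
			if (ret)
				goto unlock;
		}
	}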

> +				if (vma->obj->pin_count)
> +					continue;
> +
> +				ret = i915_vma_unbind(vma);
> +				if (ret)
> +					goto unlock;
> +			}
>  		}
>  	}
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bf1ecef..220699b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -546,6 +546,9 @@ struct i915_vma {
>  	struct drm_i915_gem_object *obj;
>  	struct i915_address_space *vm;
>  
> +	/** This object's place on the active/inactive lists */
> +	struct list_head mm_list;
> +
>  	struct list_head vma_link; /* Link in the object's VMA list */
>  };
>  
> @@ -1263,9 +1266,7 @@ struct drm_i915_gem_object {
>  	struct drm_mm_node *stolen;
>  	struct list_head global_list;
>  
> -	/** This object's place on the active/inactive lists */
>  	struct list_head ring_list;
> -	struct list_head mm_list;
>  	/** This object's place in the batchbuffer or on the eviction list */
>  	struct list_head exec_list;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ec23a5c..fb3f02f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1872,7 +1872,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	u32 seqno = intel_ring_get_seqno(ring);
>  
>  	BUG_ON(ring == NULL);
> @@ -1888,8 +1887,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  		obj->active = 1;
>  	}
>  
> -	/* Move from whatever list we were on to the tail of execution. */
> -	list_move_tail(&obj->mm_list, &vm->active_list);
>  	list_move_tail(&obj->ring_list, &ring->active_list);
>  
>  	obj->last_read_seqno = seqno;
> @@ -1911,14 +1908,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  static void
>  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  {
> -	struct drm_device *dev = obj->base.dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
>  
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>  	BUG_ON(!obj->active);
>  
> -	list_move_tail(&obj->mm_list, &vm->inactive_list);
> +	list_move_tail(&vma->mm_list, &ggtt_vm->inactive_list);
>  
>  	list_del_init(&obj->ring_list);
>  	obj->ring = NULL;
> @@ -2286,9 +2283,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm;
> -	struct drm_i915_gem_object *obj;
>  	struct intel_ring_buffer *ring;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma;
>  	int i;
>  
>  	for_each_ring(ring, dev_priv, i)
> @@ -2298,8 +2295,8 @@ void i915_gem_reset(struct drm_device *dev)
>  	 * necessary invalidation upon reuse.
>  	 */
>  	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> -		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> +		list_for_each_entry(vma, &vm->inactive_list, mm_list)
> +			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
>  
>  	i915_gem_restore_fences(dev);
>  }
> @@ -2353,6 +2350,7 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
>  			break;
>  
> +		BUG_ON(!obj->active);
>  		i915_gem_object_move_to_inactive(obj);
>  	}
>  
> @@ -2635,7 +2633,6 @@ int i915_vma_unbind(struct i915_vma *vma)
>  	i915_gem_gtt_finish_object(obj);
>  	i915_gem_object_unpin_pages(obj);
>  
> -	list_del(&obj->mm_list);
>  	/* Avoid an unnecessary call to unbind on rebind. */
>  	if (i915_is_ggtt(vma->vm))
>  		obj->map_and_fenceable = true;
> @@ -3180,7 +3177,7 @@ search_free:
>  		goto err_out;
>  
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> +	list_add_tail(&vma->mm_list, &vm->inactive_list);
>  
>  	/* Keep GGTT vmas first to make debug easier */
>  	if (i915_is_ggtt(vm))
> @@ -3342,9 +3339,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  					    old_write_domain);
>  
>  	/* And bump the LRU for this access */
> -	if (i915_gem_object_is_inactive(obj))
> -		list_move_tail(&obj->mm_list,
> -			       &dev_priv->gtt.base.inactive_list);
> +	if (i915_gem_object_is_inactive(obj)) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> +							   &dev_priv->gtt.base);
> +		if (vma)
> +			list_move_tail(&vma->mm_list,
> +				       &dev_priv->gtt.base.inactive_list);
> +
> +	}
>  
>  	return 0;
>  }
> @@ -3917,7 +3919,6 @@ unlock:
>  void i915_gem_object_init(struct drm_i915_gem_object *obj,
>  			  const struct drm_i915_gem_object_ops *ops)
>  {
> -	INIT_LIST_HEAD(&obj->mm_list);
>  	INIT_LIST_HEAD(&obj->global_list);
>  	INIT_LIST_HEAD(&obj->ring_list);
>  	INIT_LIST_HEAD(&obj->exec_list);
> @@ -4054,6 +4055,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  		return ERR_PTR(-ENOMEM);
>  
>  	INIT_LIST_HEAD(&vma->vma_link);
> +	INIT_LIST_HEAD(&vma->mm_list);
>  	vma->vm = vm;
>  	vma->obj = obj;
>  
> @@ -4063,6 +4065,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  void i915_gem_vma_destroy(struct i915_vma *vma)
>  {
>  	list_del_init(&vma->vma_link);
> +	list_del(&vma->mm_list);
>  	drm_mm_remove_node(&vma->node);
>  	kfree(vma);
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index d1cb28c..88b0f52 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -436,7 +436,10 @@ static int do_switch(struct i915_hw_context *to)
>  	 * MI_SET_CONTEXT instead of when the next seqno has completed.
>  	 */
>  	if (from != NULL) {
> +		struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
> +		struct i915_address_space *ggtt = &dev_priv->gtt.base;
>  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> +		list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);

I don't really see a reason to add this here ... shouldn't move_to_active
take care of this? Obviously not in this patch here but later on when it's
converted over.
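
I.e. once move_to_active knows about the vma (or at least the vm), the lru
bump could live in there as simply

	list_move_tail(&vma->mm_list, &vma->vm->active_list);

instead of being open-coded at every call site.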

>  		i915_gem_object_move_to_active(from->obj, ring);
>  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
>  		 * whole damn pipeline, we don't need to explicitly mark the
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 61bf5e2..425939b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
>  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
>  
>  	/* First see if there is a large enough contiguous idle region... */
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
>  		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
> @@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
>  		goto none;
>  
>  	/* Now merge in the soon-to-be-expired objects... */
> -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +	list_for_each_entry(vma, &vm->active_list, mm_list) {
>  		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
> @@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct i915_address_space *vm;
> -	struct drm_i915_gem_object *obj, *next;
> +	struct i915_vma *vma, *next;
>  	bool lists_empty = true;
>  	int ret;
>  
> @@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
>  
>  	/* Having flushed everything, unbind() should never raise an error */
>  	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> -		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> -			if (obj->pin_count == 0)
> -				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
> +		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> +			if (vma->obj->pin_count == 0)
> +				WARN_ON(i915_vma_unbind(vma));
>  	}
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 64dc6b5..0f21702 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -801,6 +801,8 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
>  		obj->base.read_domains = obj->base.pending_read_domains;
>  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
>  
> +		/* FIXME: This lookup gets fixed later <-- danvet */
> +		list_move_tail(&i915_gem_obj_to_vma(obj, vm)->mm_list, &vm->active_list);

Ah, I guess the same comment applies to the lru frobbing in do_switch?

>  		i915_gem_object_move_to_active(obj, ring);
>  		if (obj->base.write_domain) {
>  			obj->dirty = 1;
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 000ffbd..fa60103 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	obj->has_global_gtt_mapping = 1;
>  
>  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
> +	list_add_tail(&vma->mm_list, &ggtt->inactive_list);
>  
>  	return obj;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index d970d84..9623a4e 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -556,11 +556,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
>  static u32 capture_active_bo(struct drm_i915_error_buffer *err,
>  			     int count, struct list_head *head)
>  {
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	int i = 0;
>  
> -	list_for_each_entry(obj, head, mm_list) {
> -		capture_bo(err++, obj);
> +	list_for_each_entry(vma, head, mm_list) {
> +		capture_bo(err++, vma->obj);
>  		if (++i == count)
>  			break;
>  	}
> @@ -622,7 +622,8 @@ static struct drm_i915_error_object *
>  i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  			     struct intel_ring_buffer *ring)
>  {
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma;
>  	struct drm_i915_gem_object *obj;
>  	u32 seqno;
>  
> @@ -642,20 +643,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  	}
>  
>  	seqno = ring->get_seqno(ring, false);
> -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		if (obj->ring != ring)
> -			continue;
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		list_for_each_entry(vma, &vm->active_list, mm_list) {

We could instead loop over the bound list and check for ->active. But this
is ok too, albeit a bit convoluted imo.

> +			obj = vma->obj;
> +			if (obj->ring != ring)
> +				continue;
>  
> -		if (i915_seqno_passed(seqno, obj->last_read_seqno))
> -			continue;
> +			if (i915_seqno_passed(seqno, obj->last_read_seqno))
> +				continue;
>  
> -		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> -			continue;
> +			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> +				continue;
>  
> -		/* We need to copy these to an anonymous buffer as the simplest
> -		 * method to avoid being overwritten by userspace.
> -		 */
> -		return i915_error_object_create(dev_priv, obj);
> +			/* We need to copy these to an anonymous buffer as the simplest
> +			 * method to avoid being overwritten by userspace.
> +			 */
> +			return i915_error_object_create(dev_priv, obj);
> +		}
>  	}
>  
>  	return NULL;
> @@ -775,11 +779,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
>  				     struct drm_i915_error_state *error)
>  {
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_vma *vma;
>  	struct drm_i915_gem_object *obj;
>  	int i;
>  
>  	i = 0;
> -	list_for_each_entry(obj, &vm->active_list, mm_list)
> +	list_for_each_entry(vma, &vm->active_list, mm_list)
>  		i++;
>  	error->active_bo_count = i;
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/29] drm/i915: Convert execbuf code to use vmas
  2013-08-01  0:00 ` [PATCH 25/29] drm/i915: Convert execbuf code to use vmas Ben Widawsky
@ 2013-08-06 20:43   ` Daniel Vetter
  2013-08-06 20:45     ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 20:43 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:18PM -0700, Ben Widawsky wrote:
> This attempts to convert all the execbuf code to speak in vmas. Since
> the execbuf code is very self contained it was a nice isolated
> conversion.
> 
> The meat of the code is about turning eb_objects into eb_vma, and then
> wiring up the rest of the code to use vmas instead of obj, vm pairs.
> 
> Unfortunately, to do this, we must move the exec_list link from the obj
> structure. This list is reused in the eviction code, so we must also
> modify the eviction code to make this work.
> 
> v2: Release table lock early, and do a two-phase vma lookup to avoid
> having to use a GFP_ATOMIC. (Chris)
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

I think the leaking of preallocated vmas if execbuf fails can blow up:
1. We call lookup_or_create and create new vmas, linked into the vma_link
chain.
2. Later on execbuf fails somewhere (for an igt the simplest way is
probably to use more buffers than would fit into the gtt) and we bail
out.
-> Note that at this point we leak vmas which are on the vma_link list but
which have no gtt node allocation.
3. Userspace dies in flames (or just quits).
4. All buffers get their final unref and we call vma_unbind on each vma,
even the ones that do not have an allocation.
5. hilarity ensues since vma_unbind doesn't bail out if
drm_mm_node_allocated(vma->node) == false.

We need broken userspace to actually exercise this bug since all normal
ways for execbuf to bail out involve signals and ioctl restarting. If this
is a real bug I think we need an igt to exercise it.
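
If it is real, a guard along these lines should at least stop the unbind
path from blowing up (untested sketch):

	/* in i915_vma_unbind(): */
	if (!drm_mm_node_allocated(&vma->node)) {
		i915_gem_vma_destroy(vma);
		return 0;
	}

with i915_gem_vma_destroy taught to only call drm_mm_remove_node when the
node was actually allocated.
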
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  22 +-
>  drivers/gpu/drm/i915/i915_gem.c            |   3 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      |  31 ++-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 322 +++++++++++++++--------------
>  4 files changed, 201 insertions(+), 177 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c0eb7fd..ee5164e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -550,6 +550,17 @@ struct i915_vma {
>  	struct list_head mm_list;
>  
>  	struct list_head vma_link; /* Link in the object's VMA list */
> +
> +	/** This vma's place in the batchbuffer or on the eviction list */
> +	struct list_head exec_list;
> +
> +	/**
> +	 * Used for performing relocations during execbuffer insertion.
> +	 */
> +	struct hlist_node exec_node;
> +	unsigned long exec_handle;
> +	struct drm_i915_gem_exec_object2 *exec_entry;
> +
>  };
>  
>  struct i915_ctx_hang_stats {
> @@ -1267,8 +1278,8 @@ struct drm_i915_gem_object {
>  	struct list_head global_list;
>  
>  	struct list_head ring_list;
> -	/** This object's place in the batchbuffer or on the eviction list */
> -	struct list_head exec_list;
> +	/** Used in execbuf to temporarily hold a ref */
> +	struct list_head obj_exec_list;
>  
>  	/**
>  	 * This is set if the object is on the active lists (has pending
> @@ -1353,13 +1364,6 @@ struct drm_i915_gem_object {
>  	void *dma_buf_vmapping;
>  	int vmapping_count;
>  
> -	/**
> -	 * Used for performing relocations during execbuffer insertion.
> -	 */
> -	struct hlist_node exec_node;
> -	unsigned long exec_handle;
> -	struct drm_i915_gem_exec_object2 *exec_entry;
> -
>  	struct intel_ring_buffer *ring;
>  
>  	/** Breadcrumb of last rendering to the buffer. */
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 72bd53c..a4ba819 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3913,7 +3913,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
>  {
>  	INIT_LIST_HEAD(&obj->global_list);
>  	INIT_LIST_HEAD(&obj->ring_list);
> -	INIT_LIST_HEAD(&obj->exec_list);
> +	INIT_LIST_HEAD(&obj->obj_exec_list);
>  	INIT_LIST_HEAD(&obj->vma_list);
>  
>  	obj->ops = ops;
> @@ -4048,6 +4048,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  
>  	INIT_LIST_HEAD(&vma->vma_link);
>  	INIT_LIST_HEAD(&vma->mm_list);
> +	INIT_LIST_HEAD(&vma->exec_list);
>  	vma->vm = vm;
>  	vma->obj = obj;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 425939b..8787588 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -37,7 +37,7 @@ mark_free(struct i915_vma *vma, struct list_head *unwind)
>  	if (vma->obj->pin_count)
>  		return false;
>  
> -	list_add(&vma->obj->exec_list, unwind);
> +	list_add(&vma->exec_list, unwind);
>  	return drm_mm_scan_add_block(&vma->node);
>  }
>  
> @@ -49,7 +49,6 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct list_head eviction_list, unwind_list;
>  	struct i915_vma *vma;
> -	struct drm_i915_gem_object *obj;
>  	int ret = 0;
>  
>  	trace_i915_gem_evict(dev, min_size, alignment, mappable);
> @@ -104,14 +103,13 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
>  none:
>  	/* Nothing found, clean up and bail out! */
>  	while (!list_empty(&unwind_list)) {
> -		obj = list_first_entry(&unwind_list,
> -				       struct drm_i915_gem_object,
> +		vma = list_first_entry(&unwind_list,
> +				       struct i915_vma,
>  				       exec_list);
> -		vma = i915_gem_obj_to_vma(obj, vm);
>  		ret = drm_mm_scan_remove_block(&vma->node);
>  		BUG_ON(ret);
>  
> -		list_del_init(&obj->exec_list);
> +		list_del_init(&vma->exec_list);
>  	}
>  
>  	/* We expect the caller to unpin, evict all and try again, or give up.
> @@ -125,28 +123,27 @@ found:
>  	 * temporary list. */
>  	INIT_LIST_HEAD(&eviction_list);
>  	while (!list_empty(&unwind_list)) {
> -		obj = list_first_entry(&unwind_list,
> -				       struct drm_i915_gem_object,
> +		vma = list_first_entry(&unwind_list,
> +				       struct i915_vma,
>  				       exec_list);
> -		vma = i915_gem_obj_to_vma(obj, vm);
>  		if (drm_mm_scan_remove_block(&vma->node)) {
> -			list_move(&obj->exec_list, &eviction_list);
> -			drm_gem_object_reference(&obj->base);
> +			list_move(&vma->exec_list, &eviction_list);
> +			drm_gem_object_reference(&vma->obj->base);
>  			continue;
>  		}
> -		list_del_init(&obj->exec_list);
> +		list_del_init(&vma->exec_list);
>  	}
>  
>  	/* Unbinding will emit any required flushes */
>  	while (!list_empty(&eviction_list)) {
> -		obj = list_first_entry(&eviction_list,
> -				       struct drm_i915_gem_object,
> +		vma = list_first_entry(&eviction_list,
> +				       struct i915_vma,
>  				       exec_list);
>  		if (ret == 0)
> -			ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));
> +			ret = i915_vma_unbind(vma);
>  
> -		list_del_init(&obj->exec_list);
> -		drm_gem_object_unreference(&obj->base);
> +		list_del_init(&vma->exec_list);
> +		drm_gem_object_unreference(&vma->obj->base);
>  	}
>  
>  	return ret;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 3f17a55..1c9d504 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -33,24 +33,24 @@
>  #include "intel_drv.h"
>  #include <linux/dma_remapping.h>
>  
> -struct eb_objects {
> -	struct list_head objects;
> +struct eb_vmas {
> +	struct list_head vmas;
>  	int and;
>  	union {
> -		struct drm_i915_gem_object *lut[0];
> +		struct i915_vma *lut[0];
>  		struct hlist_head buckets[0];
>  	};
>  };
>  
> -static struct eb_objects *
> -eb_create(struct drm_i915_gem_execbuffer2 *args)
> +static struct eb_vmas *
> +eb_create(struct drm_i915_gem_execbuffer2 *args, struct i915_address_space *vm)
>  {
> -	struct eb_objects *eb = NULL;
> +	struct eb_vmas *eb = NULL;
>  
>  	if (args->flags & I915_EXEC_HANDLE_LUT) {
>  		int size = args->buffer_count;
> -		size *= sizeof(struct drm_i915_gem_object *);
> -		size += sizeof(struct eb_objects);
> +		size *= sizeof(struct i915_vma *);
> +		size += sizeof(struct eb_vmas);
>  		eb = kmalloc(size, GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
>  	}
>  
> @@ -61,7 +61,7 @@ eb_create(struct drm_i915_gem_execbuffer2 *args)
>  		while (count > 2*size)
>  			count >>= 1;
>  		eb = kzalloc(count*sizeof(struct hlist_head) +
> -			     sizeof(struct eb_objects),
> +			     sizeof(struct eb_vmas),
>  			     GFP_TEMPORARY);
>  		if (eb == NULL)
>  			return eb;
> @@ -70,72 +70,97 @@ eb_create(struct drm_i915_gem_execbuffer2 *args)
>  	} else
>  		eb->and = -args->buffer_count;
>  
> -	INIT_LIST_HEAD(&eb->objects);
> +	INIT_LIST_HEAD(&eb->vmas);
>  	return eb;
>  }
>  
>  static void
> -eb_reset(struct eb_objects *eb)
> +eb_reset(struct eb_vmas *eb)
>  {
>  	if (eb->and >= 0)
>  		memset(eb->buckets, 0, (eb->and+1)*sizeof(struct hlist_head));
>  }
>  
>  static int
> -eb_lookup_objects(struct eb_objects *eb,
> -		  struct drm_i915_gem_exec_object2 *exec,
> -		  const struct drm_i915_gem_execbuffer2 *args,
> -		  struct i915_address_space *vm,
> -		  struct drm_file *file)
> +eb_lookup_vmas(struct eb_vmas *eb,
> +	       struct drm_i915_gem_exec_object2 *exec,
> +	       const struct drm_i915_gem_execbuffer2 *args,
> +	       struct i915_address_space *vm,
> +	       struct drm_file *file)
>  {
>  	struct drm_i915_gem_object *obj;
> -	int i;
> +	struct list_head objects;
> +	int i, ret = 0;
>  
> +	INIT_LIST_HEAD(&objects);
>  	spin_lock(&file->table_lock);
> +	/* Grab a reference to the object and release the lock so we can lookup
> +	 * or create the VMA without using GFP_ATOMIC */
>  	for (i = 0; i < args->buffer_count; i++) {
>  		obj = to_intel_bo(idr_find(&file->object_idr, exec[i].handle));
>  		if (obj == NULL) {
>  			spin_unlock(&file->table_lock);
>  			DRM_DEBUG("Invalid object handle %d at index %d\n",
>  				   exec[i].handle, i);
> -			return -ENOENT;
> +			ret = -ENOENT;
> +			goto out;
>  		}
>  
> -		if (!list_empty(&obj->exec_list)) {
> +		if (!list_empty(&obj->obj_exec_list)) {
>  			spin_unlock(&file->table_lock);
>  			DRM_DEBUG("Object %p [handle %d, index %d] appears more than once in object list\n",
>  				   obj, exec[i].handle, i);
> -			return -EINVAL;
> +			ret = -EINVAL;
> +			goto out;
>  		}
>  
>  		drm_gem_object_reference(&obj->base);
> -		list_add_tail(&obj->exec_list, &eb->objects);
> +		list_add_tail(&obj->obj_exec_list, &objects);
>  	}
>  	spin_unlock(&file->table_lock);
>  
> -	list_for_each_entry(obj,  &eb->objects, exec_list) {
> +	i = 0;
> +	list_for_each_entry(obj, &objects, obj_exec_list) {
>  		struct i915_vma *vma;
>  
>  		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
> -		if (IS_ERR(vma))
> -			return PTR_ERR(vma);
> +		if (IS_ERR(vma)) {
> +			/* XXX: We don't need an error path for vma because if
> +			 * the vma was created just for this execbuf, object
> +			 * unreference should kill it off.*/
> +			DRM_DEBUG("Failed to lookup VMA\n");
> +			ret = PTR_ERR(vma);
> +			goto out;
> +		}
> +
> +		list_add_tail(&vma->exec_list, &eb->vmas);
>  
> -		obj->exec_entry = &exec[i];
> +		vma->exec_entry = &exec[i];
>  		if (eb->and < 0) {
> -			eb->lut[i] = obj;
> +			eb->lut[i] = vma;
>  		} else {
>  			uint32_t handle = args->flags & I915_EXEC_HANDLE_LUT ? i : exec[i].handle;
> -			obj->exec_handle = handle;
> -			hlist_add_head(&obj->exec_node,
> +			vma->exec_handle = handle;
> +			hlist_add_head(&vma->exec_node,
>  				       &eb->buckets[handle & eb->and]);
>  		}
> +		++i;
>  	}
>  
> -	return 0;
> +
> +out:
> +	while (!list_empty(&objects)) {
> +		obj = list_first_entry(&objects,
> +				       struct drm_i915_gem_object,
> +				       obj_exec_list);
> +		list_del_init(&obj->obj_exec_list);
> +		if (ret)
> +			drm_gem_object_unreference(&obj->base);
> +	}
> +	return ret;
>  }
>  
> -static struct drm_i915_gem_object *
> -eb_get_object(struct eb_objects *eb, unsigned long handle)
> +static struct i915_vma *eb_get_vma(struct eb_vmas *eb, unsigned long handle)
>  {
>  	if (eb->and < 0) {
>  		if (handle >= -eb->and)
> @@ -147,27 +172,25 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
>  
>  		head = &eb->buckets[handle & eb->and];
>  		hlist_for_each(node, head) {
> -			struct drm_i915_gem_object *obj;
> +			struct i915_vma *vma;
>  
> -			obj = hlist_entry(node, struct drm_i915_gem_object, exec_node);
> -			if (obj->exec_handle == handle)
> -				return obj;
> +			vma = hlist_entry(node, struct i915_vma, exec_node);
> +			if (vma->exec_handle == handle)
> +				return vma;
>  		}
>  		return NULL;
>  	}
>  }
>  
> -static void
> -eb_destroy(struct eb_objects *eb)
> -{
> -	while (!list_empty(&eb->objects)) {
> -		struct drm_i915_gem_object *obj;
> +static void eb_destroy(struct eb_vmas *eb) {
> +	while (!list_empty(&eb->vmas)) {
> +		struct i915_vma *vma;
>  
> -		obj = list_first_entry(&eb->objects,
> -				       struct drm_i915_gem_object,
> +		vma = list_first_entry(&eb->vmas,
> +				       struct i915_vma,
>  				       exec_list);
> -		list_del_init(&obj->exec_list);
> -		drm_gem_object_unreference(&obj->base);
> +		list_del_init(&vma->exec_list);
> +		drm_gem_object_unreference(&vma->obj->base);
>  	}
>  	kfree(eb);
>  }
> @@ -181,22 +204,24 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
>  
>  static int
>  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> -				   struct eb_objects *eb,
> +				   struct eb_vmas *eb,
>  				   struct drm_i915_gem_relocation_entry *reloc,
>  				   struct i915_address_space *vm)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_gem_object *target_obj;
>  	struct drm_i915_gem_object *target_i915_obj;
> +	struct i915_vma *target_vma;
>  	uint32_t target_offset;
>  	int ret = -EINVAL;
>  
>  	/* we've already hold a reference to all valid objects */
> -	target_obj = &eb_get_object(eb, reloc->target_handle)->base;
> -	if (unlikely(target_obj == NULL))
> +	target_vma = eb_get_vma(eb, reloc->target_handle);
> +	if (unlikely(target_vma == NULL))
>  		return -ENOENT;
> +	target_i915_obj = target_vma->obj;
> +	target_obj = &target_vma->obj->base;
>  
> -	target_i915_obj = to_intel_bo(target_obj);
>  	target_offset = i915_gem_obj_ggtt_offset(target_i915_obj);
>  
>  	/* Sandybridge PPGTT errata: We need a global gtt mapping for MI and
> @@ -305,14 +330,13 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  }
>  
>  static int
> -i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> -				    struct eb_objects *eb,
> -				    struct i915_address_space *vm)
> +i915_gem_execbuffer_relocate_vma(struct i915_vma *vma,
> +				 struct eb_vmas *eb)
>  {
>  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
>  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
>  	struct drm_i915_gem_relocation_entry __user *user_relocs;
> -	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> +	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
>  	int remain, ret;
>  
>  	user_relocs = to_user_ptr(entry->relocs_ptr);
> @@ -331,8 +355,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
>  		do {
>  			u64 offset = r->presumed_offset;
>  
> -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> -								 vm);
> +			ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, r,
> +								 vma->vm);
>  			if (ret)
>  				return ret;
>  
> @@ -353,17 +377,16 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
>  }
>  
>  static int
> -i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
> -					 struct eb_objects *eb,
> -					 struct drm_i915_gem_relocation_entry *relocs,
> -					 struct i915_address_space *vm)
> +i915_gem_execbuffer_relocate_vma_slow(struct i915_vma *vma,
> +				      struct eb_vmas *eb,
> +				      struct drm_i915_gem_relocation_entry *relocs)
>  {
> -	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> +	const struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
>  	int i, ret;
>  
>  	for (i = 0; i < entry->relocation_count; i++) {
> -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> -							 vm);
> +		ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, &relocs[i],
> +							 vma->vm);
>  		if (ret)
>  			return ret;
>  	}
> @@ -372,10 +395,10 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
>  }
>  
>  static int
> -i915_gem_execbuffer_relocate(struct eb_objects *eb,
> +i915_gem_execbuffer_relocate(struct eb_vmas *eb,
>  			     struct i915_address_space *vm)
>  {
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	int ret = 0;
>  
>  	/* This is the fast path and we cannot handle a pagefault whilst
> @@ -386,8 +409,8 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb,
>  	 * lockdep complains vehemently.
>  	 */
>  	pagefault_disable();
> -	list_for_each_entry(obj, &eb->objects, exec_list) {
> -		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
> +	list_for_each_entry(vma, &eb->vmas, exec_list) {
> +		ret = i915_gem_execbuffer_relocate_vma(vma, eb);
>  		if (ret)
>  			break;
>  	}
> @@ -400,31 +423,31 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb,
>  #define  __EXEC_OBJECT_HAS_FENCE (1<<30)
>  
>  static int
> -need_reloc_mappable(struct drm_i915_gem_object *obj)
> +need_reloc_mappable(struct i915_vma *vma)
>  {
> -	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> -	return entry->relocation_count && !use_cpu_reloc(obj);
> +	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
> +	return entry->relocation_count && !use_cpu_reloc(vma->obj);
>  }
>  
>  static int
> -i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> -				   struct intel_ring_buffer *ring,
> -				   struct i915_address_space *vm,
> -				   bool *need_reloc)
> +i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> +				struct intel_ring_buffer *ring,
> +				bool *need_reloc)
>  {
> -	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> -	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
>  	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
>  	bool need_fence, need_mappable;
> +	struct drm_i915_gem_object *obj = vma->obj;
>  	int ret;
>  
>  	need_fence =
>  		has_fenced_gpu_access &&
>  		entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
>  		obj->tiling_mode != I915_TILING_NONE;
> -	need_mappable = need_fence || need_reloc_mappable(obj);
> +	need_mappable = need_fence || need_reloc_mappable(vma);
>  
> -	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> +	ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, need_mappable,
>  				  false);
>  	if (ret)
>  		return ret;
> @@ -452,8 +475,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  		obj->has_aliasing_ppgtt_mapping = 1;
>  	}
>  
> -	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> -		entry->offset = i915_gem_obj_offset(obj, vm);
> +	if (entry->offset != vma->node.start) {
> +		entry->offset = vma->node.start;
>  		*need_reloc = true;
>  	}
>  
> @@ -470,61 +493,60 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  }
>  
>  static void
> -i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> +i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma)
>  {
>  	struct drm_i915_gem_exec_object2 *entry;
>  
> -	if (!i915_gem_obj_bound_any(obj))
> +	if (!drm_mm_node_allocated(&vma->node))
>  		return;
>  
> -	entry = obj->exec_entry;
> +	entry = vma->exec_entry;
>  
>  	if (entry->flags & __EXEC_OBJECT_HAS_FENCE)
> -		i915_gem_object_unpin_fence(obj);
> +		i915_gem_object_unpin_fence(vma->obj);
>  
>  	if (entry->flags & __EXEC_OBJECT_HAS_PIN)
> -		i915_gem_object_unpin(obj);
> +		i915_gem_object_unpin(vma->obj);
>  
>  	entry->flags &= ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN);
>  }
>  
>  static int
>  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> -			    struct list_head *objects,
> -			    struct i915_address_space *vm,
> +			    struct list_head *vmas,
>  			    bool *need_relocs)
>  {
>  	struct drm_i915_gem_object *obj;
> -	struct list_head ordered_objects;
> +	struct i915_vma *vma;
> +	struct list_head ordered_vmas;
>  	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
>  	int retry;
>  
> -	INIT_LIST_HEAD(&ordered_objects);
> -	while (!list_empty(objects)) {
> +	INIT_LIST_HEAD(&ordered_vmas);
> +	while (!list_empty(vmas)) {
>  		struct drm_i915_gem_exec_object2 *entry;
>  		bool need_fence, need_mappable;
>  
> -		obj = list_first_entry(objects,
> -				       struct drm_i915_gem_object,
> -				       exec_list);
> -		entry = obj->exec_entry;
> +		vma = list_first_entry(vmas, struct i915_vma, exec_list);
> +		obj = vma->obj;
> +		entry = vma->exec_entry;
>  
>  		need_fence =
>  			has_fenced_gpu_access &&
>  			entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
>  			obj->tiling_mode != I915_TILING_NONE;
> -		need_mappable = need_fence || need_reloc_mappable(obj);
> +		need_mappable = need_fence || need_reloc_mappable(vma);
>  
>  		if (need_mappable)
> -			list_move(&obj->exec_list, &ordered_objects);
> +			list_move(&vma->exec_list, &ordered_vmas);
>  		else
> -			list_move_tail(&obj->exec_list, &ordered_objects);
> +			list_move_tail(&vma->exec_list, &ordered_vmas);
>  
>  		obj->base.pending_read_domains = I915_GEM_GPU_DOMAINS & ~I915_GEM_DOMAIN_COMMAND;
>  		obj->base.pending_write_domain = 0;
>  		obj->pending_fenced_gpu_access = false;
>  	}
> -	list_splice(&ordered_objects, objects);
> +	list_splice(&ordered_vmas, vmas);
>  
>  	/* Attempt to pin all of the buffers into the GTT.
>  	 * This is done in 3 phases:
> @@ -543,47 +565,47 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  		int ret = 0;
>  
>  		/* Unbind any ill-fitting objects or pin. */
> -		list_for_each_entry(obj, objects, exec_list) {
> -			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> +		list_for_each_entry(vma, vmas, exec_list) {
> +			struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
>  			bool need_fence, need_mappable;
> -			u32 obj_offset;
>  
> -			if (!i915_gem_obj_bound(obj, vm))
> +			obj = vma->obj;
> +
> +			if (!drm_mm_node_allocated(&vma->node))
>  				continue;
>  
> -			obj_offset = i915_gem_obj_offset(obj, vm);
>  			need_fence =
>  				has_fenced_gpu_access &&
>  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
>  				obj->tiling_mode != I915_TILING_NONE;
> -			need_mappable = need_fence || need_reloc_mappable(obj);
> +			need_mappable = need_fence || need_reloc_mappable(vma);
>  
>  			BUG_ON((need_mappable || need_fence) &&
> -			       !i915_is_ggtt(vm));
> +			       !i915_is_ggtt(vma->vm));
>  
>  			if ((entry->alignment &&
> -			     obj_offset & (entry->alignment - 1)) ||
> +			     vma->node.start & (entry->alignment - 1)) ||
>  			    (need_mappable && !obj->map_and_fenceable))
> -				ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));
> +				ret = i915_vma_unbind(vma);
>  			else
> -				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> +				ret = i915_gem_execbuffer_reserve_vma(vma, ring, need_relocs);
>  			if (ret)
>  				goto err;
>  		}
>  
>  		/* Bind fresh objects */
> -		list_for_each_entry(obj, objects, exec_list) {
> -			if (i915_gem_obj_bound(obj, vm))
> +		list_for_each_entry(vma, vmas, exec_list) {
> +			if (drm_mm_node_allocated(&vma->node))
>  				continue;
>  
> -			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
> +			ret = i915_gem_execbuffer_reserve_vma(vma, ring, need_relocs);
>  			if (ret)
>  				goto err;
>  		}
>  
>  err:		/* Decrement pin count for bound objects */
> -		list_for_each_entry(obj, objects, exec_list)
> -			i915_gem_execbuffer_unreserve_object(obj);
> +		list_for_each_entry(vma, vmas, exec_list)
> +			i915_gem_execbuffer_unreserve_vma(vma);
>  
>  		if (ret != -ENOSPC || retry++)
>  			return ret;
> @@ -599,24 +621,27 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>  				  struct drm_i915_gem_execbuffer2 *args,
>  				  struct drm_file *file,
>  				  struct intel_ring_buffer *ring,
> -				  struct eb_objects *eb,
> -				  struct drm_i915_gem_exec_object2 *exec,
> -				  struct i915_address_space *vm)
> +				  struct eb_vmas *eb,
> +				  struct drm_i915_gem_exec_object2 *exec)
>  {
>  	struct drm_i915_gem_relocation_entry *reloc;
> -	struct drm_i915_gem_object *obj;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma;
>  	bool need_relocs;
>  	int *reloc_offset;
>  	int i, total, ret;
>  	int count = args->buffer_count;
>  
> +	if (WARN_ON(list_empty(&eb->vmas)))
> +		return 0;
> +
> +	vm = list_first_entry(&eb->vmas, struct i915_vma, exec_list)->vm;
> +
>  	/* We may process another execbuffer during the unlock... */
> -	while (!list_empty(&eb->objects)) {
> -		obj = list_first_entry(&eb->objects,
> -				       struct drm_i915_gem_object,
> -				       exec_list);
> -		list_del_init(&obj->exec_list);
> -		drm_gem_object_unreference(&obj->base);
> +	while (!list_empty(&eb->vmas)) {
> +		vma = list_first_entry(&eb->vmas, struct i915_vma, exec_list);
> +		list_del_init(&vma->exec_list);
> +		drm_gem_object_unreference(&vma->obj->base);
>  	}
>  
>  	mutex_unlock(&dev->struct_mutex);
> @@ -680,20 +705,19 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>  
>  	/* reacquire the objects */
>  	eb_reset(eb);
> -	ret = eb_lookup_objects(eb, exec, args, vm, file);
> +	ret = eb_lookup_vmas(eb, exec, args, vm, file);
>  	if (ret)
>  		goto err;
>  
>  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> +	ret = i915_gem_execbuffer_reserve(ring, &eb->vmas, &need_relocs);
>  	if (ret)
>  		goto err;
>  
> -	list_for_each_entry(obj, &eb->objects, exec_list) {
> -		int offset = obj->exec_entry - exec;
> -		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> -							       reloc + reloc_offset[offset],
> -							       vm);
> +	list_for_each_entry(vma, &eb->vmas, exec_list) {
> +		int offset = vma->exec_entry - exec;
> +		ret = i915_gem_execbuffer_relocate_vma_slow(vma, eb,
> +							    reloc + reloc_offset[offset]);
>  		if (ret)
>  			goto err;
>  	}
> @@ -712,21 +736,21 @@ err:
>  
>  static int
>  i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring,
> -				struct list_head *objects)
> +				struct list_head *vmas)
>  {
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	uint32_t flush_domains = 0;
>  	int ret;
>  
> -	list_for_each_entry(obj, objects, exec_list) {
> -		ret = i915_gem_object_sync(obj, ring);
> +	list_for_each_entry(vma, vmas, exec_list) {
> +		ret = i915_gem_object_sync(vma->obj, ring);
>  		if (ret)
>  			return ret;
>  
> -		if (obj->base.write_domain & I915_GEM_DOMAIN_CPU)
> -			i915_gem_clflush_object(obj);
> +		if (vma->obj->base.write_domain & I915_GEM_DOMAIN_CPU)
> +			i915_gem_clflush_object(vma->obj);
>  
> -		flush_domains |= obj->base.write_domain;
> +		flush_domains |= vma->obj->base.write_domain;
>  	}
>  
>  	if (flush_domains & I915_GEM_DOMAIN_CPU)
> @@ -793,13 +817,13 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
>  }
>  
>  static void
> -i915_gem_execbuffer_move_to_active(struct list_head *objects,
> -				   struct i915_address_space *vm,
> +i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>  				   struct intel_ring_buffer *ring)
>  {
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  
> -	list_for_each_entry(obj, objects, exec_list) {
> +	list_for_each_entry(vma, vmas, exec_list) {
> +		struct drm_i915_gem_object *obj = vma->obj;
>  		u32 old_read = obj->base.read_domains;
>  		u32 old_write = obj->base.write_domain;
>  
> @@ -810,7 +834,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
>  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
>  
>  		/* FIXME: This lookup gets fixed later <-- danvet */
> -		list_move_tail(&i915_gem_obj_to_vma(obj, vm)->mm_list, &vm->active_list);
> +		list_move_tail(&vma->mm_list, &vma->vm->active_list);
>  		i915_gem_object_move_to_active(obj, ring);
>  		if (obj->base.write_domain) {
>  			obj->dirty = 1;
> @@ -869,7 +893,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		       struct i915_address_space *vm)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct eb_objects *eb;
> +	struct eb_vmas *eb;
>  	struct drm_i915_gem_object *batch_obj;
>  	struct drm_clip_rect *cliprects = NULL;
>  	struct intel_ring_buffer *ring;
> @@ -1009,7 +1033,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		goto pre_mutex_err;
>  	}
>  
> -	eb = eb_create(args);
> +	eb = eb_create(args, vm);
>  	if (eb == NULL) {
>  		mutex_unlock(&dev->struct_mutex);
>  		ret = -ENOMEM;
> @@ -1017,18 +1041,16 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	}
>  
>  	/* Look up object handles */
> -	ret = eb_lookup_objects(eb, exec, args, vm, file);
> +	ret = eb_lookup_vmas(eb, exec, args, vm, file);
>  	if (ret)
>  		goto err;
>  
>  	/* take note of the batch buffer before we might reorder the lists */
> -	batch_obj = list_entry(eb->objects.prev,
> -			       struct drm_i915_gem_object,
> -			       exec_list);
> +	batch_obj = list_entry(eb->vmas.prev, struct i915_vma, exec_list)->obj;
>  
>  	/* Move the objects en-masse into the GTT, evicting if necessary. */
>  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
> +	ret = i915_gem_execbuffer_reserve(ring, &eb->vmas, &need_relocs);
>  	if (ret)
>  		goto err;
>  
> @@ -1038,7 +1060,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	if (ret) {
>  		if (ret == -EFAULT) {
>  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> -								eb, exec, vm);
> +								eb, exec);
>  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
>  		}
>  		if (ret)
> @@ -1060,7 +1082,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
>  		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
>  
> -	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->objects);
> +	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
>  	if (ret)
>  		goto err;
>  
> @@ -1115,7 +1137,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  
>  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
>  
> -	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
> +	i915_gem_execbuffer_move_to_active(&eb->vmas, ring);
>  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
>  
>  err:
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/29] drm/i915: Convert execbuf code to use vmas
  2013-08-06 20:43   ` Daniel Vetter
@ 2013-08-06 20:45     ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 20:45 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 10:43:08PM +0200, Daniel Vetter wrote:
> On Wed, Jul 31, 2013 at 05:00:18PM -0700, Ben Widawsky wrote:
> > This attempts to convert all the execbuf code to speak in vmas. Since
> > the execbuf code is very self-contained it was a nice isolated
> > conversion.
> > 
> > The meat of the code is about turning eb_objects into eb_vmas, and then
> > wiring up the rest of the code to use vmas instead of (obj, vm) pairs.
> > 
> > Unfortunately, to do this, we must move the exec_list link out of the obj
> > structure. This list is reused in the eviction code, so we must also
> > modify the eviction code to make this work.
> > 
> > v2: Release the table lock early, and do a two-phase vma lookup to avoid
> > having to use GFP_ATOMIC. (Chris)
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> I think the leaking of preallocated vmas if execbuf fails can blow up:
> 1. We call lookup_or_create and create new vmas, linked into the vma_link
> chain.
> 2. Later on execbuf fails somewhere (for an igt the simplest way is
> probably to use more buffers than would fit into the gtt) and we bail
> out.
> -> Note that at this point we leak vmas which are on the vma_link list but
> which have no gtt node allocation.
> 3. Userspace dies in flames (or just quits).
> 4. All buffers get their final unref and we call vma_unbind on each vma,
> even the ones that do not have an allocation.
> 5. hilarity ensues since vma_unbind doesn't bail out if
> drm_mm_node_allocated(vma->node) == false.
> 
> We need broken userspace to actually exercise this bug since all normal
> ways for execbuf to bail out involve signals and ioctl restarting. If this
> is a real bug I think we need an igt to exercise it.
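
To make the failure mode concrete, a guard along these lines at the top of
i915_vma_unbind() would defuse step 5 (sketch only, not part of the series;
__i915_vma_unbind() is a made-up name standing in for today's unbind body):

	int i915_vma_unbind(struct i915_vma *vma)
	{
		/* A vma that never got a drm_mm node has nothing to clear or
		 * unmap, so the final unref of a leaked vma becomes harmless. */
		if (!drm_mm_node_allocated(&vma->node))
			return 0;

		return __i915_vma_unbind(vma);
	}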

Forgot to mention: besides that error path the patch looks good. The same
applies to patches 21-24 earlier in the series.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 26/29] drm/i915: Convert active API to VMA
  2013-08-01  0:00 ` [PATCH 26/29] drm/i915: Convert active API to VMA Ben Widawsky
@ 2013-08-06 20:47   ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 20:47 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:19PM -0700, Ben Widawsky wrote:
> Even though we track activeness on the object and not on the VMA, the
> active_list is per VM, so it makes the most sense for the APIs to take
> VMAs.
> 
> NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
> now, leave them be.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Ah, here's the patch that addresses two of my earlier questions. I guess
the split was due to the execbuf conversion that needed to happen first.
Looks good.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  5 ++---
>  drivers/gpu/drm/i915/i915_gem.c            | 11 +++++++++--
>  drivers/gpu/drm/i915/i915_gem_context.c    |  8 ++++----
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 +---
>  4 files changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index ee5164e..695f1e5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1735,9 +1735,8 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  			 struct intel_ring_buffer *to);
> -void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> -				    struct intel_ring_buffer *ring);
> -
> +void i915_vma_move_to_active(struct i915_vma *vma,
> +			     struct intel_ring_buffer *ring);
>  int i915_gem_dumb_create(struct drm_file *file_priv,
>  			 struct drm_device *dev,
>  			 struct drm_mode_create_dumb *args);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a4ba819..24c1a91 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1866,11 +1866,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>  	return 0;
>  }
>  
> -void
> +static void
>  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  			       struct intel_ring_buffer *ring)
>  {
> -	struct drm_device *dev = obj->base.dev;
> +	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	u32 seqno = intel_ring_get_seqno(ring);
>  
> @@ -1905,6 +1905,13 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  	}
>  }
>  
> +void i915_vma_move_to_active(struct i915_vma *vma,
> +			     struct intel_ring_buffer *ring)
> +{
> +	list_move_tail(&vma->mm_list, &vma->vm->active_list);
> +	return i915_gem_object_move_to_active(vma->obj, ring);
> +}
> +
>  static void
>  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 88b0f52..147399c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -436,11 +436,11 @@ static int do_switch(struct i915_hw_context *to)
>  	 * MI_SET_CONTEXT instead of when the next seqno has completed.
>  	 */
>  	if (from != NULL) {
> -		struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
> -		struct i915_address_space *ggtt = &dev_priv->gtt.base;
> +		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +		struct i915_vma *vma =
> +			i915_gem_obj_to_vma(from->obj, &dev_priv->gtt.base);
>  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> -		list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
> -		i915_gem_object_move_to_active(from->obj, ring);
> +		i915_vma_move_to_active(vma, ring);
>  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
>  		 * whole damn pipeline, we don't need to explicitly mark the
>  		 * object dirty. The only exception is that the context must be
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 1c9d504..b8bb7f5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -833,9 +833,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>  		obj->base.read_domains = obj->base.pending_read_domains;
>  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
>  
> -		/* FIXME: This lookup gets fixed later <-- danvet */
> -		list_move_tail(&vma->mm_list, &vma->vm->active_list);
> -		i915_gem_object_move_to_active(obj, ring);
> +		i915_vma_move_to_active(vma, ring);
>  		if (obj->base.write_domain) {
>  			obj->dirty = 1;
>  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 28/29] drm/i915: Use the new vm [un]bind functions
  2013-08-01  0:00 ` [PATCH 28/29] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-08-06 20:58   ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 20:58 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:21PM -0700, Ben Widawsky wrote:
> Building on the last patch which created the new function pointers in
> the VM for bind/unbind, here we actually put those new function pointers
> to use.
> 
> Split out as a separate patch to aid in review. I'm fine with squashing
> into the previous patch if people request it.
> 
> v2: Updated to address the smart ggtt which can do aliasing as needed.
> Make sure we bind to the global gtt when mappable and fenceable. I thought
> we could get away without this initially, but we cannot.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

I don't like how this is split since it's a (small, but still)
flip-the-world approach: first you create completely new code, then you
rip out the old one and switch over. So this should definitely be squashed
for easier review, and if it's too big then split it up into different
refactoring steps (where each step keeps all the code working while slowly
transforming it). I'll punt on this one for now.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  9 ------
>  drivers/gpu/drm/i915/i915_gem.c            | 31 ++++++++-----------
>  drivers/gpu/drm/i915/i915_gem_context.c    |  8 +++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 ++++++++----------
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++----------------------------
>  5 files changed, 34 insertions(+), 91 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2849297..a9c3110 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1938,17 +1938,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>  
>  /* i915_gem_gtt.c */
>  void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
> -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> -			    struct drm_i915_gem_object *obj,
> -			    enum i915_cache_level cache_level);
> -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> -			      struct drm_i915_gem_object *obj);
> -
>  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
>  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> -				enum i915_cache_level cache_level);
> -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
>  void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
>  void i915_gem_init_global_gtt(struct drm_device *dev);
>  void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 24c1a91..1f35ae4 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2631,12 +2631,8 @@ int i915_vma_unbind(struct i915_vma *vma)
>  
>  	trace_i915_vma_unbind(vma);
>  
> -	if (obj->has_global_gtt_mapping && i915_is_ggtt(vma->vm))
> -		i915_gem_gtt_unbind_object(obj);
> -	if (obj->has_aliasing_ppgtt_mapping) {
> -		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> -		obj->has_aliasing_ppgtt_mapping = 0;
> -	}
> +	vma->vm->unbind_vma(vma);
> +
>  	i915_gem_gtt_finish_object(obj);
>  	i915_gem_object_unpin_pages(obj);
>  
> @@ -3354,7 +3350,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  				    enum i915_cache_level cache_level)
>  {
>  	struct drm_device *dev = obj->base.dev;
> -	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct i915_vma *vma;
>  	int ret;
>  
> @@ -3393,11 +3388,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  				return ret;
>  		}
>  
> -		if (obj->has_global_gtt_mapping)
> -			i915_gem_gtt_bind_object(obj, cache_level);
> -		if (obj->has_aliasing_ppgtt_mapping)
> -			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> -					       obj, cache_level);
> +		list_for_each_entry(vma, &obj->vma_list, vma_link)
> +			vma->vm->bind_vma(vma, cache_level, 0);
>  	}
>  
>  	if (cache_level == I915_CACHE_NONE) {
> @@ -3676,6 +3668,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  		    bool map_and_fenceable,
>  		    bool nonblocking)
>  {
> +	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
>  	struct i915_vma *vma;
>  	int ret;
>  
> @@ -3704,20 +3697,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  	}
>  
>  	if (!i915_gem_obj_bound(obj, vm)) {
> -		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> -
>  		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
>  						 map_and_fenceable,
>  						 nonblocking);
>  		if (ret)
>  			return ret;
>  
> -		if (!dev_priv->mm.aliasing_ppgtt)
> -			i915_gem_gtt_bind_object(obj, obj->cache_level);
> -	}
> +		vma = i915_gem_obj_to_vma(obj, vm);
> +		vm->bind_vma(vma, obj->cache_level, flags);
> +	} else
> +		vma = i915_gem_obj_to_vma(obj, vm);
>  
> +	/* Objects are created map and fenceable. If we bind an object
> +	 * the first time, and we had aliasing PPGTT (and didn't request
> +	 * GLOBAL), we'll need to do this on the second bind.*/
>  	if (!obj->has_global_gtt_mapping && map_and_fenceable)
> -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> +		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
>  
>  	obj->pin_count++;
>  	obj->pin_mappable |= map_and_fenceable;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 147399c..10a5618 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -391,6 +391,7 @@ mi_set_context(struct intel_ring_buffer *ring,
>  static int do_switch(struct i915_hw_context *to)
>  {
>  	struct intel_ring_buffer *ring = to->ring;
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	struct i915_hw_context *from = ring->last_context;
>  	u32 hw_flags = 0;
>  	int ret;
> @@ -415,8 +416,11 @@ static int do_switch(struct i915_hw_context *to)
>  		return ret;
>  	}
>  
> -	if (!to->obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
> +	if (!to->obj->has_global_gtt_mapping) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
> +							   &dev_priv->gtt.base);
> +		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
> +	}
>  
>  	if (!to->is_initialized || is_default_context(to))
>  		hw_flags |= MI_RESTORE_INHIBIT;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index b8bb7f5..4719e74 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -230,8 +230,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  	if (unlikely(IS_GEN6(dev) &&
>  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
>  	    !target_i915_obj->has_global_gtt_mapping)) {
> -		i915_gem_gtt_bind_object(target_i915_obj,
> -					 target_i915_obj->cache_level);
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		vma->vm->bind_vma(vma, target_i915_obj->cache_level,
> +				 GLOBAL_BIND);
>  	}
>  
>  	/* Validate that the target is in a valid r/w GPU domain */
> @@ -434,11 +435,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
>  				struct intel_ring_buffer *ring,
>  				bool *need_reloc)
>  {
> -	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct drm_i915_gem_object *obj = vma->obj;
>  	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
>  	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
>  	bool need_fence, need_mappable;
> -	struct drm_i915_gem_object *obj = vma->obj;
> +	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
> +		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
>  	int ret;
>  
>  	need_fence =
> @@ -467,14 +469,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
>  		}
>  	}
>  
> -	/* Ensure ppgtt mapping exists if needed */
> -	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> -		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> -				       obj, obj->cache_level);
> -
> -		obj->has_aliasing_ppgtt_mapping = 1;
> -	}
> -
>  	if (entry->offset != vma->node.start) {
>  		entry->offset = vma->node.start;
>  		*need_reloc = true;
> @@ -485,9 +479,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
>  		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
>  	}
>  
> -	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
> -	    !obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> +	vma->vm->bind_vma(vma, obj->cache_level, flags);
>  
>  	return 0;
>  }
> @@ -1077,8 +1069,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
>  	 * hsw should have this fixed, but let's be paranoid and do it
>  	 * unconditionally for now. */
> -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +	if (flags & I915_DISPATCH_SECURE &&
> +	    !batch_obj->has_global_gtt_mapping) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
> +		vm->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> +	}
>  
>  	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
>  	if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 39ac266..74b5077 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -412,15 +412,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
>  	dev_priv->mm.aliasing_ppgtt = NULL;
>  }
>  
> -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> -			    struct drm_i915_gem_object *obj,
> -			    enum i915_cache_level cache_level)
> -{
> -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -				   cache_level);
> -}
> -
>  static void __always_unused
>  gen6_ppgtt_bind_vma(struct i915_vma *vma,
>  		    enum i915_cache_level cache_level,
> @@ -433,14 +424,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
>  	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
>  }
>  
> -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> -			      struct drm_i915_gem_object *obj)
> -{
> -	ppgtt->base.clear_range(&ppgtt->base,
> -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -				obj->base.size >> PAGE_SHIFT);
> -}
> -
>  static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
>  {
>  	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> @@ -501,8 +484,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> +							   &dev_priv->gtt.base);
>  		i915_gem_clflush_object(obj);
> -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> +		vma->vm->bind_vma(vma, obj->cache_level, 0);
>  	}
>  
>  	i915_gem_chipset_flush(dev);
> @@ -658,33 +643,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
>  	}
>  }
>  
> -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> -			      enum i915_cache_level cache_level)
> -{
> -	struct drm_device *dev = obj->base.dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> -
> -	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
> -					  entry,
> -					  cache_level);
> -
> -	obj->has_global_gtt_mapping = 1;
> -}
> -
> -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> -{
> -	struct drm_device *dev = obj->base.dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> -
> -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> -				       entry,
> -				       obj->base.size >> PAGE_SHIFT);
> -
> -	obj->has_global_gtt_mapping = 0;
> -}
> -
>  static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
>  {
>  	struct drm_device *dev = vma->vm->dev;
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-08-06 18:39   ` Daniel Vetter
@ 2013-08-06 21:27     ` Ben Widawsky
  2013-08-06 21:29       ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-06 21:27 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 08:39:50PM +0200, Daniel Vetter wrote:
> On Wed, Jul 31, 2013 at 05:00:11PM -0700, Ben Widawsky wrote:
> > Eviction code, like the rest of the converted code, needs to be aware of
> > the address space from which it is evicting (or, in the evict-everything
> > case, all address spaces). With the updated bind/unbind interfaces of the
> > last patch, we can now safely move the eviction code over.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Two comments below.
> -Daniel
> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h       |  4 ++-
> >  drivers/gpu/drm/i915/i915_gem.c       |  2 +-
> >  drivers/gpu/drm/i915/i915_gem_evict.c | 53 +++++++++++++++++++----------------
> >  3 files changed, 33 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 0610588..bf1ecef 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1946,7 +1946,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
> >  
> >  
> >  /* i915_gem_evict.c */
> > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> > +int __must_check i915_gem_evict_something(struct drm_device *dev,
> > +					  struct i915_address_space *vm,
> > +					  int min_size,
> >  					  unsigned alignment,
> >  					  unsigned cache_level,
> >  					  bool mappable,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 0cb36c2..1013105 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3159,7 +3159,7 @@ search_free:
> >  						  size, alignment,
> >  						  obj->cache_level, 0, gtt_max);
> >  	if (ret) {
> > -		ret = i915_gem_evict_something(dev, size, alignment,
> > +		ret = i915_gem_evict_something(dev, vm, size, alignment,
> >  					       obj->cache_level,
> >  					       map_and_fenceable,
> >  					       nonblocking);
> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index 9205a41..61bf5e2 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -32,26 +32,21 @@
> >  #include "i915_trace.h"
> >  
> >  static bool
> > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> > +mark_free(struct i915_vma *vma, struct list_head *unwind)
> >  {
> > -	struct drm_device *dev = obj->base.dev;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_vma *vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
> > -
> > -	if (obj->pin_count)
> > +	if (vma->obj->pin_count)
> >  		return false;
> >  
> > -	list_add(&obj->exec_list, unwind);
> > +	list_add(&vma->obj->exec_list, unwind);
> >  	return drm_mm_scan_add_block(&vma->node);
> >  }
> >  
> >  int
> > -i915_gem_evict_something(struct drm_device *dev, int min_size,
> > -			 unsigned alignment, unsigned cache_level,
> > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > +			 int min_size, unsigned alignment, unsigned cache_level,
> >  			 bool mappable, bool nonblocking)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	struct list_head eviction_list, unwind_list;
> >  	struct i915_vma *vma;
> >  	struct drm_i915_gem_object *obj;
> > @@ -83,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> >  	 */
> >  
> >  	INIT_LIST_HEAD(&unwind_list);
> > -	if (mappable)
> > +	if (mappable) {
> > +		BUG_ON(!i915_is_ggtt(vm));
> >  		drm_mm_init_scan_with_range(&vm->mm, min_size,
> >  					    alignment, cache_level, 0,
> >  					    dev_priv->gtt.mappable_end);
> > -	else
> > +	} else
> >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> >  
> >  	/* First see if there is a large enough contiguous idle region... */
> >  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > -		if (mark_free(obj, &unwind_list))
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> >  
> > @@ -101,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> >  
> >  	/* Now merge in the soon-to-be-expired objects... */
> >  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > -		if (mark_free(obj, &unwind_list))
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> >  
> > @@ -111,7 +109,7 @@ none:
> >  		obj = list_first_entry(&unwind_list,
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> > -		vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> >  		ret = drm_mm_scan_remove_block(&vma->node);
> >  		BUG_ON(ret);
> >  
> > @@ -132,7 +130,7 @@ found:
> >  		obj = list_first_entry(&unwind_list,
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> > -		vma = i915_gem_obj_to_vma(obj, &dev_priv->gtt.base);
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> >  		if (drm_mm_scan_remove_block(&vma->node)) {
> >  			list_move(&obj->exec_list, &eviction_list);
> >  			drm_gem_object_reference(&obj->base);
> > @@ -147,7 +145,7 @@ found:
> >  				       struct drm_i915_gem_object,
> >  				       exec_list);
> >  		if (ret == 0)
> > -			ret = i915_gem_object_ggtt_unbind(obj);
> > +			ret = i915_vma_unbind(i915_gem_obj_to_vma(obj, vm));
> 
> Again I think the ggtt_unbind->vma_unbind conversion seems to leak the
> vma. It feels like vma_unbind should call vma_destroy?
> 

The VMA should also get cleaned up when the object backing the vma is
destroyed. I agree that at present unbind and destroy have little
distinction, but I can foresee that changing; proper faulting is one such
case, off the top of my head.
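
For reference, the sweep I have in mind at final-free time looks roughly
like this (sketch only; i915_gem_vma_destroy() is the vma destructor from
earlier in the series, the wrapper name here is made up):

	static void i915_gem_object_free_vmas(struct drm_i915_gem_object *obj)
	{
		struct i915_vma *vma, *next;

		/* Tear down every vma on the object, whether or not it ever
		 * got a drm_mm node, so nothing is left dangling. */
		list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
			if (drm_mm_node_allocated(&vma->node))
				WARN_ON(i915_vma_unbind(vma));
			i915_gem_vma_destroy(vma);
		}
	}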

Anyway, let me know if your leak concern is addressed.

> >  
> >  		list_del_init(&obj->exec_list);
> >  		drm_gem_object_unreference(&obj->base);
> > @@ -160,13 +158,18 @@ int
> >  i915_gem_evict_everything(struct drm_device *dev)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> >  	struct drm_i915_gem_object *obj, *next;
> > -	bool lists_empty;
> > +	bool lists_empty = true;
> >  	int ret;
> >  
> > -	lists_empty = (list_empty(&vm->inactive_list) &&
> > -		       list_empty(&vm->active_list));
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		lists_empty = (list_empty(&vm->inactive_list) &&
> > +			       list_empty(&vm->active_list));
> > +		if (!lists_empty)
> > +			lists_empty = false;
> > +	}
> > +
> >  	if (lists_empty)
> >  		return -ENOSPC;
> >  
> > @@ -183,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  	i915_gem_retire_requests(dev);
> >  
> >  	/* Having flushed everything, unbind() should never raise an error */
> > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > -		if (obj->pin_count == 0)
> > -			WARN_ON(i915_gem_object_ggtt_unbind(obj));
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > +			if (obj->pin_count == 0)
> > +				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
> > +	}
> 
> The conversion of evict_everything looks a bit strange. Essentially we
> have three callers:
> - ums+gem support code in leavevt, to rid the gtt of all gem objects when
>   the userspace X ums ddx stops controlling the hw.
> - When we have seriously run out of memory, in shrink_all.
> - In execbuf, when we've fragmented the gtt address space so badly that we
>   need to start over completely fresh.
> 
> With this in mind it would imo make sense to just loop over the global
> bound object list. But for the execbuf caller, adding a vm parameter (and
> only evicting from that specific vm, skipping all others) would make
> sense. Other callers would pass NULL since they want everything to get
> evicted. Can I volunteer you for that follow-up?
> 

We need evict_everything for #1. For #2, we already call evict_something
for the vm when we go through the out-of-space path. If that failed,
evicting everything for a specific VM is just the same operation. But
maybe I've glossed over something in the details; please correct me if I'm
wrong. Is there a case where evict_something might fail with ENOSPC but
evicting everything in a VM would help?
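
If it helps, the per-VM variant I'd picture for the execbuf caller is
something like this (sketch only; the name and the exact idle/retire dance
are illustrative, not code from this series):

	int i915_gem_evict_vm(struct i915_address_space *vm)
	{
		struct drm_i915_gem_object *obj, *next;
		int ret;

		ret = i915_gpu_idle(vm->dev);
		if (ret)
			return ret;
		i915_gem_retire_requests(vm->dev);

		/* With the GPU idled and requests retired, the active list
		 * has drained, so unbinding the inactive list empties the vm. */
		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
			if (obj->pin_count == 0)
				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));

		return 0;
	}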

> >  
> >  	return 0;
> >  }
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any
  2013-08-06 18:43   ` Daniel Vetter
@ 2013-08-06 21:29     ` Ben Widawsky
  0 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-06 21:29 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 08:43:42PM +0200, Daniel Vetter wrote:
> On Wed, Jul 31, 2013 at 05:00:12PM -0700, Ben Widawsky wrote:
> > In some places, we want to know if an object is bound in any address
> > space, and not just the global GTT. This often applies when there is a
> > single global resource (object, pages, etc.)
> > 
> > function                             |      reason
> > --------------------------------------------------
> > i915_gem_object_is_inactive          | global object
> > i915_gem_object_put_pages            | object's pages
> > i915_gem_object_unpin                | global object
> > i915_gem_execbuffer_unreserve_object | temporary until we plumb vma
> > pread/pwrite                         | object's domain
> 
> pread/pwrite isn't about the object's domain at all, but purely about
> synchronizing for outstanding rendering. Replacing the call to
> set_to_gtt_domain with a wait_rendering would imo improve code
> readability. Furthermore we could pimp pread to only block for outstanding
> writes and not for reads.

I might have been the first to trip over it, but this isn't my first
instance ;-).

> 
> Since you're not the first one to trip over this: Can I volunteer you for
> a follow-up patch to fix this?

Working on it now.
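
Roughly what I have in mind for the pread side (sketch only; reusing the
existing static i915_gem_object_wait_rendering(obj, readonly) helper is an
assumption about how the follow-up will look):

	/* Instead of forcing a GTT domain change just to serialise with the
	 * GPU, only wait for rendering; readonly=true means pending GPU reads
	 * don't block a CPU read, only pending writes do. */
	static int i915_gem_object_wait_for_pread(struct drm_i915_gem_object *obj)
	{
		return i915_gem_object_wait_rendering(obj, true);
	}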

> 
> Otherwise patch looks good.
> -Daniel
> 
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c            | 12 ++++++------
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
> >  2 files changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 1013105..d4d6444 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -122,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> >  static inline bool
> >  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> >  {
> > -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> > +	return i915_gem_obj_bound_any(obj) && !obj->active;
> >  }
> >  
> >  int
> > @@ -408,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
> >  		 * anyway again before the next pread happens. */
> >  		if (obj->cache_level == I915_CACHE_NONE)
> >  			needs_clflush = 1;
> > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > +		if (i915_gem_obj_bound_any(obj)) {
> >  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
> >  			if (ret)
> >  				return ret;
> > @@ -725,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
> >  		 * right away and we therefore have to clflush anyway. */
> >  		if (obj->cache_level == I915_CACHE_NONE)
> >  			needs_clflush_after = 1;
> > -		if (i915_gem_obj_ggtt_bound(obj)) {
> > +		if (i915_gem_obj_bound_any(obj)) {
> >  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
> >  			if (ret)
> >  				return ret;
> > @@ -1659,7 +1659,7 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
> >  	if (obj->pages_pin_count)
> >  		return -EBUSY;
> >  
> > -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> > +	BUG_ON(i915_gem_obj_bound_any(obj));
> >  
> >  	/* ->put_pages might need to allocate memory for the bit17 swizzle
> >  	 * array, hence protect them from being reaped by removing them from gtt
> > @@ -3301,7 +3301,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  	int ret;
> >  
> >  	/* Not valid to be called on unbound objects. */
> > -	if (!i915_gem_obj_ggtt_bound(obj))
> > +	if (!i915_gem_obj_bound_any(obj))
> >  		return -EINVAL;
> >  
> >  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> > @@ -3725,7 +3725,7 @@ void
> >  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
> >  {
> >  	BUG_ON(obj->pin_count == 0);
> > -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> > +	BUG_ON(!i915_gem_obj_bound_any(obj));
> >  
> >  	if (--obj->pin_count == 0)
> >  		obj->pin_mappable = false;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 5e68f1e..64dc6b5 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -466,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> >  {
> >  	struct drm_i915_gem_exec_object2 *entry;
> >  
> > -	if (!i915_gem_obj_ggtt_bound(obj))
> > +	if (!i915_gem_obj_bound_any(obj))
> >  		return;
> >  
> >  	entry = obj->exec_entry;
> > -- 
> > 1.8.3.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-08-06 21:27     ` Ben Widawsky
@ 2013-08-06 21:29       ` Daniel Vetter
  2013-08-06 22:57         ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 21:29 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 02:27:39PM -0700, Ben Widawsky wrote:
> On Tue, Aug 06, 2013 at 08:39:50PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 31, 2013 at 05:00:11PM -0700, Ben Widawsky wrote:
> > > @@ -183,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> > >  	i915_gem_retire_requests(dev);
> > >  
> > >  	/* Having flushed everything, unbind() should never raise an error */
> > > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > -		if (obj->pin_count == 0)
> > > -			WARN_ON(i915_gem_object_ggtt_unbind(obj));
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > +			if (obj->pin_count == 0)
> > > +				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
> > > +	}
> > 
> > The conversion of evict_everything looks a bit strange. Essentially we
> > have three callers:
> > - ums+gem support code in leavevt to rid the gtt of all gem objects when
> >   the userspace X ums ddx stops controlling the hw.
> > - When we seriously ran out of memory, in shrink_all.
> > - In execbuf when we've fragmented the gtt address space so badly that we
> >   need to start over completely fresh.
> > 
> > With this it imo would make sense to just loop over the global bound
> > object lists. But for the execbuf caller adding a vm parameter (and only
> > evicting from that special vm, skipping all others) would make sense.
> > Other callers would pass NULL since they want everything to get evicted.
> > Volunteered for that follow-up?
> > 
> 
> We need evict_everything for #1. For #2, we call evict_something already
> for the vm when we go through the out of space path. If that failed,
> evicting everything for a specific VM is just the same operation. But
> maybe I've glossed over something in the details. Please correct if I'm
> wrong. Is there a case where evict something might fail with ENOSPC, and
> evict everything in a VM would help?

Yes, when we've terminally fragmented the gtt we kick out everything and
start over. That's the 3rd usecase.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-08-06 21:29       ` Daniel Vetter
@ 2013-08-06 22:57         ` Ben Widawsky
  2013-08-06 22:59           ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-06 22:57 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 11:29:49PM +0200, Daniel Vetter wrote:
> On Tue, Aug 06, 2013 at 02:27:39PM -0700, Ben Widawsky wrote:
> > On Tue, Aug 06, 2013 at 08:39:50PM +0200, Daniel Vetter wrote:
> > > On Wed, Jul 31, 2013 at 05:00:11PM -0700, Ben Widawsky wrote:
> > > > @@ -183,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
> > > >  	i915_gem_retire_requests(dev);
> > > >  
> > > >  	/* Having flushed everything, unbind() should never raise an error */
> > > > -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > > -		if (obj->pin_count == 0)
> > > > -			WARN_ON(i915_gem_object_ggtt_unbind(obj));
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > > +			if (obj->pin_count == 0)
> > > > +				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
> > > > +	}
> > > 
> > > The conversion of evict_everything looks a bit strange. Essentially we
> > > have three callers:
> > > - ums+gem support code in leavevt to rid the gtt of all gem objects when
> > >   the userspace X ums ddx stops controlling the hw.
> > > - When we seriously ran out of memory, in shrink_all.
> > > - In execbuf when we've fragmented the gtt address space so badly that we
> > >   need to start over completely fresh.
> > > 
> > > With this it imo would make sense to just loop over the global bound
> > > object lists. But for the execbuf caller adding a vm parameter (and only
> > > evicting from that special vm, skipping all others) would make sense.
> > > Other callers would pass NULL since they want everything to get evicted.
> > > Volunteered for that follow-up?
> > > 
> > 
> > We need evict_everything for #1. For #2, we call evict_something already
> > for the vm when we go through the out of space path. If that failed,
> > evicting everything for a specific VM is just the same operation. But
> > maybe I've glossed over something in the details. Please correct if I'm
> > wrong. Is there a case where evict something might fail with ENOSPC, and
> > evict everything in a VM would help?
> 
> Yes, when we've terminally fragmented the gtt we kick out everything and
> start over. That's the 3rd usecase.
> -Daniel

I am not seeing it. To me evict_something is what you want, and the fix
for wherever the 3rd usecase is (please point it out, I'm dense) is it
should call evict_something, not evict_everything.

If by GTT you mean the aperture... that's kind of a different can of
worms completely. In that case I don't think you want to do anything per
VM, though potentially you can do it that way and be a little fairer.


-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-08-06 22:57         ` Ben Widawsky
@ 2013-08-06 22:59           ` Daniel Vetter
  2013-08-06 23:25             ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 22:59 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Aug 7, 2013 at 12:57 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
>> > We need evict_everything for #1. For #2, we call evict_something already
>> > for the vm when we go through the out of space path. If that failed,
>> > evicting everything for a specific VM is just the same operation. But
>> > maybe I've glossed over something in the details. Please correct if I'm
>> > wrong. Is there a case where evict something might fail with ENOSPC, and
>> > evict everything in a VM would help?
>>
>> Yes, when we've terminally fragmented the gtt we kick out everything and
>> start over. That's the 3rd usecase.
>
> I am not seeing it. To me evict_something is what you want, and the fix
> for wherever the 3rd usecase is (please point it out, I'm dense) is it
> should call evict_something, not evict_everything.

See the call to evict_everything in
i915_gem_execbuffer.c:i915_gem_execbuffer_reserve

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-08-06 22:59           ` Daniel Vetter
@ 2013-08-06 23:25             ` Ben Widawsky
  2013-08-06 23:44               ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-06 23:25 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Wed, Aug 07, 2013 at 12:59:29AM +0200, Daniel Vetter wrote:
> On Wed, Aug 7, 2013 at 12:57 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> >> > We need evict_everything for #1. For #2, we call evict_something already
> >> > for the vm when we go through the out of space path. If that failed,
> >> > evicting everything for a specific VM is just the same operation. But
> >> > maybe I've glossed over something in the details. Please correct if I'm
> >> > wrong. Is there a case where evict something might fail with ENOSPC, and
> >> > evict everything in a VM would help?
> >>
> >> Yes, when we've terminally fragmented the gtt we kick out everything and
> >> start over. That's the 3rd usecase.
> >
> > I am not seeing it. To me evict_something is what you want, and the fix
> > for wherever the 3rd usecase is (please point it out, I'm dense) is it
> > should call evict_something, not evict_everything.
> 
> See the call to evict_everything in
> i915_gem_execbuffer.c:i915_gem_execbuffer_reserve
> 

As I was saying in the first response - you only hit this if
evict_something() for a vm fails, right? That's the way ret == ENOSPC
AFAICT.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-08-06 23:25             ` Ben Widawsky
@ 2013-08-06 23:44               ` Daniel Vetter
  2013-08-07 18:24                 ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-06 23:44 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Aug 7, 2013 at 1:25 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Wed, Aug 07, 2013 at 12:59:29AM +0200, Daniel Vetter wrote:
>> On Wed, Aug 7, 2013 at 12:57 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
>> >> > We need evict_everything for #1. For #2, we call evict_something already
>> >> > for the vm when we go through the out of space path. If that failed,
>> >> > evicting everything for a specific VM is just the same operation. But
>> >> > maybe I've glossed over something in the details. Please correct if I'm
>> >> > wrong. Is there a case where evict something might fail with ENOSPC, and
>> >> > evict everything in a VM would help?
>> >>
>> >> Yes, when we've terminally fragmented the gtt we kick out everything and
>> >> start over. That's the 3rd usecase.
>> >
>> > I am not seeing it. To me evict_something is what you want, and the fix
>> > for wherever the 3rd usecase is (please point it out, I'm dense) is it
>> > should call evict_something, not evict_everything.
>>
>> See the call to evict_everything in
>> i915_gem_execbuffer.c:i915_gem_execbuffer_reserve
>>
>
> As I was saying in the first response - you only hit this if
> evict_something() for a vm fails, right? That's the way ret == ENOSPC
> AFAICT.

Like I've said if we can't fit a batch we do a last ditch effort of
evicting everything and starting over anew. That's also what the retry
logic in there is for. This happens after evict_something failed.
Dunno what exactly isn't clear or what's confusing ...
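
Condensed, the retry loop in i915_gem_execbuffer_reserve() is shaped
roughly like this (paraphrased; reserve_objects() here just stands in for
the actual pin/bind loop):

        retry = 0;
        do {
                ret = reserve_objects();  /* try to bind every execbuf object */
                if (ret != -ENOSPC || retry++)
                        return ret;

                /* evict_something() already failed inside the bind path, so as
                 * a last resort kick out everything and start over fresh. */
                ret = i915_gem_evict_everything(ring->dev);
                if (ret)
                        return ret;
        } while (1);
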
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 21/29] drm/i915: mm_list is per VMA
  2013-08-06 19:38   ` Daniel Vetter
@ 2013-08-07  0:28     ` Ben Widawsky
  2013-08-07 20:52       ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-07  0:28 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 09:38:41PM +0200, Daniel Vetter wrote:
> On Wed, Jul 31, 2013 at 05:00:14PM -0700, Ben Widawsky wrote:
> > formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
> > 
> > The mm_list is used for the active/inactive LRUs. Since those LRUs are
> > per address space, the link should be per VMA.
> > 
> > Because we'll only ever have 1 VMA before this point, it's not incorrect
> > to defer this change until this point in the patch series, and doing it
> > here makes the change much easier to understand.
> > 
> > Shamelessly manipulated out of Daniel:
> > "active/inactive stuff is used by eviction when we run out of address
> > space, so needs to be per-vma and per-address space. Bound/unbound otoh
> > is used by the shrinker which only cares about the amount of memory used
> > and not one bit about in which address space this memory is all used in.
> > Of course to actually kick out an object we need to unbind it from every
> > address space, but for that we have the per-object list of vmas."
> > 
> > v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
> > 
> > v3: Moved earlier in the series
> > 
> > v4: Add dropped message from v3
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Some comments below for this one. The lru changes look a bit strange so
> I'll wait for your confirmation that the do_switch hunk has the same
> reasons as the one in execbuf with the FIXME comment.
> -Daniel
> 
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c        | 53 ++++++++++++++++++++----------
> >  drivers/gpu/drm/i915/i915_drv.h            |  5 +--
> >  drivers/gpu/drm/i915/i915_gem.c            | 37 +++++++++++----------
> >  drivers/gpu/drm/i915/i915_gem_context.c    |  3 ++
> >  drivers/gpu/drm/i915/i915_gem_evict.c      | 14 ++++----
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 ++
> >  drivers/gpu/drm/i915/i915_gem_stolen.c     |  2 +-
> >  drivers/gpu/drm/i915/i915_gpu_error.c      | 37 ++++++++++++---------
> >  8 files changed, 91 insertions(+), 62 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 6d5ca85bd..181e5a6 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -149,7 +149,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	struct drm_device *dev = node->minor->dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > -	struct drm_i915_gem_object *obj;
> > +	struct i915_vma *vma;
> >  	size_t total_obj_size, total_gtt_size;
> >  	int count, ret;
> >  
> > @@ -157,6 +157,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	if (ret)
> >  		return ret;
> >  
> > +	/* FIXME: the user of this interface might want more than just GGTT */
> >  	switch (list) {
> >  	case ACTIVE_LIST:
> >  		seq_puts(m, "Active:\n");
> > @@ -172,12 +173,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	}
> >  
> >  	total_obj_size = total_gtt_size = count = 0;
> > -	list_for_each_entry(obj, head, mm_list) {
> > -		seq_puts(m, "   ");
> > -		describe_obj(m, obj);
> > -		seq_putc(m, '\n');
> > -		total_obj_size += obj->base.size;
> > -		total_gtt_size += i915_gem_obj_ggtt_size(obj);
> > +	list_for_each_entry(vma, head, mm_list) {
> > +		seq_printf(m, "   ");
> > +		describe_obj(m, vma->obj);
> > +		seq_printf(m, "\n");
> > +		total_obj_size += vma->obj->base.size;
> > +		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
> 
> Why not use vma->node.size? If you don't disagree I'll bikeshed this while
> applying.
> 

I think in terms of the diff, it's more logical to do it how I did. The
result should damn well be the same though, so go right ahead. When I
set about writing the series, I really didn't want to use
node.size/start directly as much as possible - so we can sneak things
into the helpers as needed.


> >  		count++;
> >  	}
> >  	mutex_unlock(&dev->struct_mutex);
> > @@ -224,7 +225,18 @@ static int per_file_stats(int id, void *ptr, void *data)
> >  	return 0;
> >  }
> >  
> > -static int i915_gem_object_info(struct seq_file *m, void *data)
> > +#define count_vmas(list, member) do { \
> > +	list_for_each_entry(vma, list, member) { \
> > +		size += i915_gem_obj_ggtt_size(vma->obj); \
> > +		++count; \
> > +		if (vma->obj->map_and_fenceable) { \
> > +			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
> > +			++mappable_count; \
> > +		} \
> > +	} \
> > +} while (0)
> > +
> > +static int i915_gem_object_info(struct seq_file *m, void* data)
> >  {
> >  	struct drm_info_node *node = (struct drm_info_node *) m->private;
> >  	struct drm_device *dev = node->minor->dev;
> > @@ -234,6 +246,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> >  	struct drm_i915_gem_object *obj;
> >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	struct drm_file *file;
> > +	struct i915_vma *vma;
> >  	int ret;
> >  
> >  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> > @@ -253,12 +266,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> >  		   count, mappable_count, size, mappable_size);
> >  
> >  	size = count = mappable_size = mappable_count = 0;
> > -	count_objects(&vm->active_list, mm_list);
> > +	count_vmas(&vm->active_list, mm_list);
> >  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
> >  		   count, mappable_count, size, mappable_size);
> >  
> >  	size = count = mappable_size = mappable_count = 0;
> > -	count_objects(&vm->inactive_list, mm_list);
> > +	count_vmas(&vm->inactive_list, mm_list);
> >  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
> >  		   count, mappable_count, size, mappable_size);
> >  
> > @@ -1771,7 +1784,8 @@ i915_drop_caches_set(void *data, u64 val)
> >  	struct drm_device *dev = data;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct drm_i915_gem_object *obj, *next;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> > +	struct i915_vma *vma, *x;
> >  	int ret;
> >  
> >  	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
> > @@ -1792,13 +1806,16 @@ i915_drop_caches_set(void *data, u64 val)
> >  		i915_gem_retire_requests(dev);
> >  
> >  	if (val & DROP_BOUND) {
> > -		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > -					 mm_list) {
> > -			if (obj->pin_count)
> > -				continue;
> > -			ret = i915_gem_object_ggtt_unbind(obj);
> > -			if (ret)
> > -				goto unlock;
> > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +			list_for_each_entry_safe(vma, x, &vm->inactive_list,
> > +						 mm_list) {
> 
> Imo the double-loop is a bit funny, looping over the global bound list
> and skipping all active objects is imo the more straightforward logic. But
> I agree that this is the more straightforward conversion, so I'm ok with a
> follow-up fixup patch.
> 

I guess we have a lot of such conversions. I don't really mind the
change, just a bit worried that it's less tested than what I've already
done. I'm also not yet convinced the result will be a huge improvement
for readability, but I've been staring at these lists for so long, my
opinion is quite biased.

I guess we'll have to see. I've made a note to myself to look into
converting all these types of loops over, but as we should see little to
no functional impact from the change, I'd like to hold off until we get
the rest merged.
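
For the DROP_BOUND case specifically, I read your suggestion as something
like the sketch below (untested; I'm assuming the per-object list of vmas
ends up being called obj->vma_list):

        /* walk the global bound list instead of the per-vm inactive lists */
        list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
                                 global_list) {
                if (obj->active || obj->pin_count)
                        continue;

                /* unbind the object from every address space it lives in */
                list_for_each_entry_safe(vma, x, &obj->vma_list, vma_link) {
                        ret = i915_vma_unbind(vma);
                        if (ret)
                                goto unlock;
                }
        }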

> > +				if (vma->obj->pin_count)
> > +					continue;
> > +
> > +				ret = i915_vma_unbind(vma);
> > +				if (ret)
> > +					goto unlock;
> > +			}
> >  		}
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index bf1ecef..220699b 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -546,6 +546,9 @@ struct i915_vma {
> >  	struct drm_i915_gem_object *obj;
> >  	struct i915_address_space *vm;
> >  
> > +	/** This object's place on the active/inactive lists */
> > +	struct list_head mm_list;
> > +
> >  	struct list_head vma_link; /* Link in the object's VMA list */
> >  };
> >  
> > @@ -1263,9 +1266,7 @@ struct drm_i915_gem_object {
> >  	struct drm_mm_node *stolen;
> >  	struct list_head global_list;
> >  
> > -	/** This object's place on the active/inactive lists */
> >  	struct list_head ring_list;
> > -	struct list_head mm_list;
> >  	/** This object's place in the batchbuffer or on the eviction list */
> >  	struct list_head exec_list;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index ec23a5c..fb3f02f 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1872,7 +1872,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> >  	u32 seqno = intel_ring_get_seqno(ring);
> >  
> >  	BUG_ON(ring == NULL);
> > @@ -1888,8 +1887,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  		obj->active = 1;
> >  	}
> >  
> > -	/* Move from whatever list we were on to the tail of execution. */
> > -	list_move_tail(&obj->mm_list, &vm->active_list);
> >  	list_move_tail(&obj->ring_list, &ring->active_list);
> >  
> >  	obj->last_read_seqno = seqno;
> > @@ -1911,14 +1908,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  static void
> >  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> >  {
> > -	struct drm_device *dev = obj->base.dev;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> > +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> >  
> >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> >  	BUG_ON(!obj->active);
> >  
> > -	list_move_tail(&obj->mm_list, &vm->inactive_list);
> > +	list_move_tail(&vma->mm_list, &ggtt_vm->inactive_list);
> >  
> >  	list_del_init(&obj->ring_list);
> >  	obj->ring = NULL;
> > @@ -2286,9 +2283,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
> >  void i915_gem_reset(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct i915_address_space *vm;
> > -	struct drm_i915_gem_object *obj;
> >  	struct intel_ring_buffer *ring;
> > +	struct i915_address_space *vm;
> > +	struct i915_vma *vma;
> >  	int i;
> >  
> >  	for_each_ring(ring, dev_priv, i)
> > @@ -2298,8 +2295,8 @@ void i915_gem_reset(struct drm_device *dev)
> >  	 * necessary invalidation upon reuse.
> >  	 */
> >  	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > -		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > -			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > +		list_for_each_entry(vma, &vm->inactive_list, mm_list)
> > +			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> >  
> >  	i915_gem_restore_fences(dev);
> >  }
> > @@ -2353,6 +2350,7 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> >  			break;
> >  
> > +		BUG_ON(!obj->active);
> >  		i915_gem_object_move_to_inactive(obj);
> >  	}
> >  
> > @@ -2635,7 +2633,6 @@ int i915_vma_unbind(struct i915_vma *vma)
> >  	i915_gem_gtt_finish_object(obj);
> >  	i915_gem_object_unpin_pages(obj);
> >  
> > -	list_del(&obj->mm_list);
> >  	/* Avoid an unnecessary call to unbind on rebind. */
> >  	if (i915_is_ggtt(vma->vm))
> >  		obj->map_and_fenceable = true;
> > @@ -3180,7 +3177,7 @@ search_free:
> >  		goto err_out;
> >  
> >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > +	list_add_tail(&vma->mm_list, &vm->inactive_list);
> >  
> >  	/* Keep GGTT vmas first to make debug easier */
> >  	if (i915_is_ggtt(vm))
> > @@ -3342,9 +3339,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  					    old_write_domain);
> >  
> >  	/* And bump the LRU for this access */
> > -	if (i915_gem_object_is_inactive(obj))
> > -		list_move_tail(&obj->mm_list,
> > -			       &dev_priv->gtt.base.inactive_list);
> > +	if (i915_gem_object_is_inactive(obj)) {
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> > +							   &dev_priv->gtt.base);
> > +		if (vma)
> > +			list_move_tail(&vma->mm_list,
> > +				       &dev_priv->gtt.base.inactive_list);
> > +
> > +	}
> >  
> >  	return 0;
> >  }
> > @@ -3917,7 +3919,6 @@ unlock:
> >  void i915_gem_object_init(struct drm_i915_gem_object *obj,
> >  			  const struct drm_i915_gem_object_ops *ops)
> >  {
> > -	INIT_LIST_HEAD(&obj->mm_list);
> >  	INIT_LIST_HEAD(&obj->global_list);
> >  	INIT_LIST_HEAD(&obj->ring_list);
> >  	INIT_LIST_HEAD(&obj->exec_list);
> > @@ -4054,6 +4055,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> >  		return ERR_PTR(-ENOMEM);
> >  
> >  	INIT_LIST_HEAD(&vma->vma_link);
> > +	INIT_LIST_HEAD(&vma->mm_list);
> >  	vma->vm = vm;
> >  	vma->obj = obj;
> >  
> > @@ -4063,6 +4065,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> >  void i915_gem_vma_destroy(struct i915_vma *vma)
> >  {
> >  	list_del_init(&vma->vma_link);
> > +	list_del(&vma->mm_list);
> >  	drm_mm_remove_node(&vma->node);
> >  	kfree(vma);
> >  }
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index d1cb28c..88b0f52 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -436,7 +436,10 @@ static int do_switch(struct i915_hw_context *to)
> >  	 * MI_SET_CONTEXT instead of when the next seqno has completed.
> >  	 */
> >  	if (from != NULL) {
> > +		struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
> > +		struct i915_address_space *ggtt = &dev_priv->gtt.base;
> >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > +		list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
> 
> I don't really see a reason to add this here ... shouldn't move_to_active
> take care of this? Obviously not in this patch here but later on when it's
> converted over.

Yes. You're right - it's sort of an ugly intermediate artifact.
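
Once the active tracking is converted over later in the series, the plan
is for a vma-level helper to do this LRU bump itself, very roughly (sketch
only; the name and signature here are hypothetical):

        void i915_vma_move_to_active(struct i915_vma *vma,
                                     struct intel_ring_buffer *ring)
        {
                /* bump the vma on its address space's active LRU ... */
                list_move_tail(&vma->mm_list, &vma->vm->active_list);
                /* ... and keep the existing per-object bookkeeping */
                i915_gem_object_move_to_active(vma->obj, ring);
        }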

> 
> >  		i915_gem_object_move_to_active(from->obj, ring);
> >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> >  		 * whole damn pipeline, we don't need to explicitly mark the
> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index 61bf5e2..425939b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> >  
> >  	/* First see if there is a large enough contiguous idle region... */
> > -	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
> >  		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> > @@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> >  		goto none;
> >  
> >  	/* Now merge in the soon-to-be-expired objects... */
> > -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +	list_for_each_entry(vma, &vm->active_list, mm_list) {
> >  		if (mark_free(vma, &unwind_list))
> >  			goto found;
> >  	}
> > @@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	struct i915_address_space *vm;
> > -	struct drm_i915_gem_object *obj, *next;
> > +	struct i915_vma *vma, *next;
> >  	bool lists_empty = true;
> >  	int ret;
> >  
> > @@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  
> >  	/* Having flushed everything, unbind() should never raise an error */
> >  	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > -		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > -			if (obj->pin_count == 0)
> > -				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
> > +		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> > +			if (vma->obj->pin_count == 0)
> > +				WARN_ON(i915_vma_unbind(vma));
> >  	}
> >  
> >  	return 0;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 64dc6b5..0f21702 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -801,6 +801,8 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> >  		obj->base.read_domains = obj->base.pending_read_domains;
> >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> >  
> > +		/* FIXME: This lookup gets fixed later <-- danvet */
> > +		list_move_tail(&i915_gem_obj_to_vma(obj, vm)->mm_list, &vm->active_list);
> 
> Ah, I guess the same comment applies to the lru frobbing in do_switch?
> 
> >  		i915_gem_object_move_to_active(obj, ring);
> >  		if (obj->base.write_domain) {
> >  			obj->dirty = 1;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > index 000ffbd..fa60103 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> >  	obj->has_global_gtt_mapping = 1;
> >  
> >  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > -	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
> > +	list_add_tail(&vma->mm_list, &ggtt->inactive_list);
> >  
> >  	return obj;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index d970d84..9623a4e 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -556,11 +556,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
> >  static u32 capture_active_bo(struct drm_i915_error_buffer *err,
> >  			     int count, struct list_head *head)
> >  {
> > -	struct drm_i915_gem_object *obj;
> > +	struct i915_vma *vma;
> >  	int i = 0;
> >  
> > -	list_for_each_entry(obj, head, mm_list) {
> > -		capture_bo(err++, obj);
> > +	list_for_each_entry(vma, head, mm_list) {
> > +		capture_bo(err++, vma->obj);
> >  		if (++i == count)
> >  			break;
> >  	}
> > @@ -622,7 +622,8 @@ static struct drm_i915_error_object *
> >  i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> >  			     struct intel_ring_buffer *ring)
> >  {
> > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_address_space *vm;
> > +	struct i915_vma *vma;
> >  	struct drm_i915_gem_object *obj;
> >  	u32 seqno;
> >  
> > @@ -642,20 +643,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> >  	}
> >  
> >  	seqno = ring->get_seqno(ring, false);
> > -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > -		if (obj->ring != ring)
> > -			continue;
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		list_for_each_entry(vma, &vm->active_list, mm_list) {
> 
> We could instead loop over the bound list and check for ->active. But this
> is ok too, albeit a bit convoluted imo.
> 
> > +			obj = vma->obj;
> > +			if (obj->ring != ring)
> > +				continue;
> >  
> > -		if (i915_seqno_passed(seqno, obj->last_read_seqno))
> > -			continue;
> > +			if (i915_seqno_passed(seqno, obj->last_read_seqno))
> > +				continue;
> >  
> > -		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> > -			continue;
> > +			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> > +				continue;
> >  
> > -		/* We need to copy these to an anonymous buffer as the simplest
> > -		 * method to avoid being overwritten by userspace.
> > -		 */
> > -		return i915_error_object_create(dev_priv, obj);
> > +			/* We need to copy these to an anonymous buffer as the simplest
> > +			 * method to avoid being overwritten by userspace.
> > +			 */
> > +			return i915_error_object_create(dev_priv, obj);
> > +		}
> >  	}
> >  
> >  	return NULL;
> > @@ -775,11 +779,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
> >  				     struct drm_i915_error_state *error)
> >  {
> >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > +	struct i915_vma *vma;
> >  	struct drm_i915_gem_object *obj;
> >  	int i;
> >  
> >  	i = 0;
> > -	list_for_each_entry(obj, &vm->active_list, mm_list)
> > +	list_for_each_entry(vma, &vm->active_list, mm_list)
> >  		i++;
> >  	error->active_bo_count = i;
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code
  2013-08-06 23:44               ` Daniel Vetter
@ 2013-08-07 18:24                 ` Ben Widawsky
  0 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-07 18:24 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Wed, Aug 07, 2013 at 01:44:58AM +0200, Daniel Vetter wrote:
> On Wed, Aug 7, 2013 at 1:25 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Wed, Aug 07, 2013 at 12:59:29AM +0200, Daniel Vetter wrote:
> >> On Wed, Aug 7, 2013 at 12:57 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> >> >> > We need evict_everything for #1. For #2, we call evict_something already
> >> >> > for the vm when we go through the out of space path. If that failed,
> >> >> > evicting everything for a specific VM is just the same operation. But
> >> >> > maybe I've glossed over something in the details. Please correct if I'm
> >> >> > wrong. Is there a case where evict something might fail with ENOSPC, and
> >> >> > evict everything in a VM would help?
> >> >>
> >> >> Yes, when we've terminally fragmented the gtt we kick out everything and
> >> >> start over. That's the 3rd usecase.
> >> >
> >> > I am not seeing it. To me evict_something is what you want, and the fix
> >> > for wherever the 3rd usecase is (please point it out, I'm dense) is it
> >> > should call evict_something, not evict_everything.
> >>
> >> See the call to evict_everything in
> >> i915_gem_execbuffer.c:i915_gem_execbuffer_reserve
> >>
> >
> > As I was saying in the first response - you only hit this if
> > evict_something() for a vm fails, right? That's the way ret == ENOSPC
> > AFAICT.
> 
> Like I've said if we can't fit a batch we do a last ditch effort of
> evicting everything and starting over anew. That's also what the retry
> logic in there is for. This happens after evict_something failed.
> Dunno what exactly isn't clear or what's confusing ...
> -Daniel

Okay, sorted this out on IRC. You'll get a new patch as described with a
new function for per vm eviction (which will just idle, and call
evict_something() with proper args)
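
Something along these lines, i.e. (rough sketch only - name and signature
are placeholders until the actual patch shows up):

        /* Evict every unpinned vma from a single address space. */
        int i915_gem_evict_vm(struct drm_device *dev,
                              struct i915_address_space *vm)
        {
                struct i915_vma *vma, *next;
                int ret;

                ret = i915_gpu_idle(dev);
                if (ret)
                        return ret;
                i915_gem_retire_requests(dev);

                /* after idling, nothing should be left on the active list */
                WARN_ON(!list_empty(&vm->active_list));

                list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
                        if (vma->obj->pin_count == 0)
                                WARN_ON(i915_vma_unbind(vma));

                return 0;
        }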

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 20/29] drm/i915: Fix up map and fenceable for VMA
  2013-08-06 19:11   ` Daniel Vetter
@ 2013-08-07 18:37     ` Ben Widawsky
  2013-08-07 20:32       ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-07 18:37 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 09:11:54PM +0200, Daniel Vetter wrote:
> On Wed, Jul 31, 2013 at 05:00:13PM -0700, Ben Widawsky wrote:
> > formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
> > tracking"
> > 
> > The map_and_fenceable tracking is per object. GTT mapping, and fences
> > only apply to global GTT. As such,  object operations which are not
> > performed on the global GTT should not affect mappable or fenceable
> > characteristics.
> > 
> > Functionally, this commit could very well be squashed in to a previous
> > patch which updated object operations to take a VM argument.  This
> > commit is split out because it's a bit tricky (or at least it was for
> > me).
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index d4d6444..ec23a5c 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2626,7 +2626,7 @@ int i915_vma_unbind(struct i915_vma *vma)
> >  
> >  	trace_i915_vma_unbind(vma);
> >  
> > -	if (obj->has_global_gtt_mapping)
> > +	if (obj->has_global_gtt_mapping && i915_is_ggtt(vma->vm))
> >  		i915_gem_gtt_unbind_object(obj);
> >  	if (obj->has_aliasing_ppgtt_mapping) {
> >  		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> 
> Hm, shouldn't we do the is_ggtt check for both? After all only the global
> ggtt can be aliased ever ... This would also be more symmetric with some
> of the other global gtt checks I've spotted. Your take, or will that run
> afoul of your Great Plan?
> -Daniel
> 

You're right. The check makes sense for both cases. In both the original
series, and in a few patches, this code turns into:
vma->vm->unbind_vma(vma);

This ugliness is a result of bad rebasing on my part.
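
i.e. once the address spaces grow bind/unbind vfuncs at the end of the
series, the GGTT unbind can drop the explicit is_ggtt() check entirely;
roughly (just a sketch of the shape, not the final patch):

        static void ggtt_unbind_vma(struct i915_vma *vma)
        {
                struct drm_device *dev = vma->obj->base.dev;
                struct drm_i915_private *dev_priv = dev->dev_private;
                struct drm_i915_gem_object *obj = vma->obj;

                /* only ever called for the global GTT, so no is_ggtt() check */
                if (obj->has_global_gtt_mapping)
                        i915_gem_gtt_unbind_object(obj);

                if (obj->has_aliasing_ppgtt_mapping) {
                        i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
                        obj->has_aliasing_ppgtt_mapping = 0;
                }
        }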

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 20/29] drm/i915: Fix up map and fenceable for VMA
  2013-08-07 18:37     ` Ben Widawsky
@ 2013-08-07 20:32       ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-07 20:32 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Aug 07, 2013 at 11:37:04AM -0700, Ben Widawsky wrote:
> On Tue, Aug 06, 2013 at 09:11:54PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 31, 2013 at 05:00:13PM -0700, Ben Widawsky wrote:
> > > formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
> > > tracking"
> > > 
> > > The map_and_fenceable tracking is per object. GTT mapping, and fences
> > > only apply to global GTT. As such,  object operations which are not
> > > performed on the global GTT should not affect mappable or fenceable
> > > characteristics.
> > > 
> > > Functionally, this commit could very well be squashed in to a previous
> > > patch which updated object operations to take a VM argument.  This
> > > commit is split out because it's a bit tricky (or at least it was for
> > > me).
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
> > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index d4d6444..ec23a5c 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2626,7 +2626,7 @@ int i915_vma_unbind(struct i915_vma *vma)
> > >  
> > >  	trace_i915_vma_unbind(vma);
> > >  
> > > -	if (obj->has_global_gtt_mapping)
> > > +	if (obj->has_global_gtt_mapping && i915_is_ggtt(vma->vm))
> > >  		i915_gem_gtt_unbind_object(obj);
> > >  	if (obj->has_aliasing_ppgtt_mapping) {
> > >  		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> > 
> > Hm, shouldn't we do the is_ggtt check for both? After all only the global
> > ggtt can be aliased ever ... This would also be more symmetric with some
> > of the other global gtt checks I've spotted. Your take, or will that run
> > afoul of your Great Plan?
> > -Daniel
> > 
> 
> You're right. The check makes sense for both cases. In both the original
> series, and in a few patches, this code turns into:
> vma->vm->unbind_vma(vma);

Ok, I've killed it and merged the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 21/29] drm/i915: mm_list is per VMA
  2013-08-07  0:28     ` Ben Widawsky
@ 2013-08-07 20:52       ` Daniel Vetter
  2013-08-08  4:32         ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-07 20:52 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Aug 06, 2013 at 05:28:06PM -0700, Ben Widawsky wrote:
> On Tue, Aug 06, 2013 at 09:38:41PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 31, 2013 at 05:00:14PM -0700, Ben Widawsky wrote:
> > > formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
> > > 
> > > The mm_list is used for the active/inactive LRUs. Since those LRUs are
> > > per address space, the link should be per VMA.
> > > 
> > > Because we'll only ever have 1 VMA before this point, it's not incorrect
> > > to defer this change until this point in the patch series, and doing it
> > > here makes the change much easier to understand.
> > > 
> > > Shamelessly manipulated out of Daniel:
> > > "active/inactive stuff is used by eviction when we run out of address
> > > space, so needs to be per-vma and per-address space. Bound/unbound otoh
> > > is used by the shrinker which only cares about the amount of memory used
> > > and not one bit about in which address space this memory is all used in.
> > > Of course to actually kick out an object we need to unbind it from every
> > > address space, but for that we have the per-object list of vmas."
> > > 
> > > v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
> > > 
> > > v3: Moved earlier in the series
> > > 
> > > v4: Add dropped message from v3
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > Some comments below for this one. The lru changes look a bit strange so
> > I'll wait for your confirmation that the do_switch hunk has the same
> > reasons as the one in execbuf with the FIXME comment.
> > -Daniel
> > 
> > > ---
> > >  drivers/gpu/drm/i915/i915_debugfs.c        | 53 ++++++++++++++++++++----------
> > >  drivers/gpu/drm/i915/i915_drv.h            |  5 +--
> > >  drivers/gpu/drm/i915/i915_gem.c            | 37 +++++++++++----------
> > >  drivers/gpu/drm/i915/i915_gem_context.c    |  3 ++
> > >  drivers/gpu/drm/i915/i915_gem_evict.c      | 14 ++++----
> > >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 ++
> > >  drivers/gpu/drm/i915/i915_gem_stolen.c     |  2 +-
> > >  drivers/gpu/drm/i915/i915_gpu_error.c      | 37 ++++++++++++---------
> > >  8 files changed, 91 insertions(+), 62 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > > index 6d5ca85bd..181e5a6 100644
> > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > @@ -149,7 +149,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > >  	struct drm_device *dev = node->minor->dev;
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > -	struct drm_i915_gem_object *obj;
> > > +	struct i915_vma *vma;
> > >  	size_t total_obj_size, total_gtt_size;
> > >  	int count, ret;
> > >  
> > > @@ -157,6 +157,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > +	/* FIXME: the user of this interface might want more than just GGTT */
> > >  	switch (list) {
> > >  	case ACTIVE_LIST:
> > >  		seq_puts(m, "Active:\n");
> > > @@ -172,12 +173,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > >  	}
> > >  
> > >  	total_obj_size = total_gtt_size = count = 0;
> > > -	list_for_each_entry(obj, head, mm_list) {
> > > -		seq_puts(m, "   ");
> > > -		describe_obj(m, obj);
> > > -		seq_putc(m, '\n');
> > > -		total_obj_size += obj->base.size;
> > > -		total_gtt_size += i915_gem_obj_ggtt_size(obj);
> > > +	list_for_each_entry(vma, head, mm_list) {
> > > +		seq_printf(m, "   ");
> > > +		describe_obj(m, vma->obj);
> > > +		seq_printf(m, "\n");
> > > +		total_obj_size += vma->obj->base.size;
> > > +		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
> > 
> > Why not use vma->node.size? If you don't disagree I'll bikeshed this while
> > applying.
> > 
> 
> I think in terms of the diff, it's more logical to do it how I did. The
> result should damn well be the same though, so go right ahead. When I
> set about writing the series, I really didn't want to use
> node.size/start directly as much as possible - so we can sneak things
> into the helpers as needed.

I've applied this bikeshed, but the patch required some wiggling in and
conflict resolution. I've checked with your branch and that seems to be
due to the removal of the inactive list walking to adjust the gpu domains
in i915_gem_reset. Please check that I didn't botch the patch rebasing
with your tree.
-Daniel

> 
> 
> > >  		count++;
> > >  	}
> > >  	mutex_unlock(&dev->struct_mutex);
> > > @@ -224,7 +225,18 @@ static int per_file_stats(int id, void *ptr, void *data)
> > >  	return 0;
> > >  }
> > >  
> > > -static int i915_gem_object_info(struct seq_file *m, void *data)
> > > +#define count_vmas(list, member) do { \
> > > +	list_for_each_entry(vma, list, member) { \
> > > +		size += i915_gem_obj_ggtt_size(vma->obj); \
> > > +		++count; \
> > > +		if (vma->obj->map_and_fenceable) { \
> > > +			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
> > > +			++mappable_count; \
> > > +		} \
> > > +	} \
> > > +} while (0)
> > > +
> > > +static int i915_gem_object_info(struct seq_file *m, void* data)
> > >  {
> > >  	struct drm_info_node *node = (struct drm_info_node *) m->private;
> > >  	struct drm_device *dev = node->minor->dev;
> > > @@ -234,6 +246,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> > >  	struct drm_i915_gem_object *obj;
> > >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > >  	struct drm_file *file;
> > > +	struct i915_vma *vma;
> > >  	int ret;
> > >  
> > >  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> > > @@ -253,12 +266,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> > >  		   count, mappable_count, size, mappable_size);
> > >  
> > >  	size = count = mappable_size = mappable_count = 0;
> > > -	count_objects(&vm->active_list, mm_list);
> > > +	count_vmas(&vm->active_list, mm_list);
> > >  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
> > >  		   count, mappable_count, size, mappable_size);
> > >  
> > >  	size = count = mappable_size = mappable_count = 0;
> > > -	count_objects(&vm->inactive_list, mm_list);
> > > +	count_vmas(&vm->inactive_list, mm_list);
> > >  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
> > >  		   count, mappable_count, size, mappable_size);
> > >  
> > > @@ -1771,7 +1784,8 @@ i915_drop_caches_set(void *data, u64 val)
> > >  	struct drm_device *dev = data;
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct drm_i915_gem_object *obj, *next;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_address_space *vm;
> > > +	struct i915_vma *vma, *x;
> > >  	int ret;
> > >  
> > >  	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
> > > @@ -1792,13 +1806,16 @@ i915_drop_caches_set(void *data, u64 val)
> > >  		i915_gem_retire_requests(dev);
> > >  
> > >  	if (val & DROP_BOUND) {
> > > -		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > > -					 mm_list) {
> > > -			if (obj->pin_count)
> > > -				continue;
> > > -			ret = i915_gem_object_ggtt_unbind(obj);
> > > -			if (ret)
> > > -				goto unlock;
> > > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +			list_for_each_entry_safe(vma, x, &vm->inactive_list,
> > > +						 mm_list) {
> > 
> > Imo the double-loop is a bit funny, looping over the global bound list
> > and skipping all active objects is imo the more straightforward logic. But
> > I agree that this is the more straightforward conversion, so I'm ok with a
> > follow-up fixup patch.
> > 
> 
> I guess we have a lot of such conversions. I don't really mind the
> change, just a bit worried that it's less tested than what I've already
> done. I'm also not yet convinced the result will be a huge improvement
> for readability, but I've been staring at these lists for so long, my
> opinion is quite biased.
> 
> I guess we'll have to see. I've made a note to myself to look into
> converting all these types of loops over, but as we should see little to
> no functional impact from the change, I'd like to hold off until we get
> the rest merged.
> 
> > > +				if (vma->obj->pin_count)
> > > +					continue;
> > > +
> > > +				ret = i915_vma_unbind(vma);
> > > +				if (ret)
> > > +					goto unlock;
> > > +			}
> > >  		}
> > >  	}
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index bf1ecef..220699b 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -546,6 +546,9 @@ struct i915_vma {
> > >  	struct drm_i915_gem_object *obj;
> > >  	struct i915_address_space *vm;
> > >  
> > > +	/** This object's place on the active/inactive lists */
> > > +	struct list_head mm_list;
> > > +
> > >  	struct list_head vma_link; /* Link in the object's VMA list */
> > >  };
> > >  
> > > @@ -1263,9 +1266,7 @@ struct drm_i915_gem_object {
> > >  	struct drm_mm_node *stolen;
> > >  	struct list_head global_list;
> > >  
> > > -	/** This object's place on the active/inactive lists */
> > >  	struct list_head ring_list;
> > > -	struct list_head mm_list;
> > >  	/** This object's place in the batchbuffer or on the eviction list */
> > >  	struct list_head exec_list;
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index ec23a5c..fb3f02f 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -1872,7 +1872,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > >  	u32 seqno = intel_ring_get_seqno(ring);
> > >  
> > >  	BUG_ON(ring == NULL);
> > > @@ -1888,8 +1887,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > >  		obj->active = 1;
> > >  	}
> > >  
> > > -	/* Move from whatever list we were on to the tail of execution. */
> > > -	list_move_tail(&obj->mm_list, &vm->active_list);
> > >  	list_move_tail(&obj->ring_list, &ring->active_list);
> > >  
> > >  	obj->last_read_seqno = seqno;
> > > @@ -1911,14 +1908,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > >  static void
> > >  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> > >  {
> > > -	struct drm_device *dev = obj->base.dev;
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> > > +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> > >  
> > >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> > >  	BUG_ON(!obj->active);
> > >  
> > > -	list_move_tail(&obj->mm_list, &vm->inactive_list);
> > > +	list_move_tail(&vma->mm_list, &ggtt_vm->inactive_list);
> > >  
> > >  	list_del_init(&obj->ring_list);
> > >  	obj->ring = NULL;
> > > @@ -2286,9 +2283,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
> > >  void i915_gem_reset(struct drm_device *dev)
> > >  {
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct i915_address_space *vm;
> > > -	struct drm_i915_gem_object *obj;
> > >  	struct intel_ring_buffer *ring;
> > > +	struct i915_address_space *vm;
> > > +	struct i915_vma *vma;
> > >  	int i;
> > >  
> > >  	for_each_ring(ring, dev_priv, i)
> > > @@ -2298,8 +2295,8 @@ void i915_gem_reset(struct drm_device *dev)
> > >  	 * necessary invalidation upon reuse.
> > >  	 */
> > >  	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > -		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > -			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > > +		list_for_each_entry(vma, &vm->inactive_list, mm_list)
> > > +			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > >  
> > >  	i915_gem_restore_fences(dev);
> > >  }
> > > @@ -2353,6 +2350,7 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> > >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> > >  			break;
> > >  
> > > +		BUG_ON(!obj->active);
> > >  		i915_gem_object_move_to_inactive(obj);
> > >  	}
> > >  
> > > @@ -2635,7 +2633,6 @@ int i915_vma_unbind(struct i915_vma *vma)
> > >  	i915_gem_gtt_finish_object(obj);
> > >  	i915_gem_object_unpin_pages(obj);
> > >  
> > > -	list_del(&obj->mm_list);
> > >  	/* Avoid an unnecessary call to unbind on rebind. */
> > >  	if (i915_is_ggtt(vma->vm))
> > >  		obj->map_and_fenceable = true;
> > > @@ -3180,7 +3177,7 @@ search_free:
> > >  		goto err_out;
> > >  
> > >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > > -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > > +	list_add_tail(&vma->mm_list, &vm->inactive_list);
> > >  
> > >  	/* Keep GGTT vmas first to make debug easier */
> > >  	if (i915_is_ggtt(vm))
> > > @@ -3342,9 +3339,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> > >  					    old_write_domain);
> > >  
> > >  	/* And bump the LRU for this access */
> > > -	if (i915_gem_object_is_inactive(obj))
> > > -		list_move_tail(&obj->mm_list,
> > > -			       &dev_priv->gtt.base.inactive_list);
> > > +	if (i915_gem_object_is_inactive(obj)) {
> > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> > > +							   &dev_priv->gtt.base);
> > > +		if (vma)
> > > +			list_move_tail(&vma->mm_list,
> > > +				       &dev_priv->gtt.base.inactive_list);
> > > +
> > > +	}
> > >  
> > >  	return 0;
> > >  }
> > > @@ -3917,7 +3919,6 @@ unlock:
> > >  void i915_gem_object_init(struct drm_i915_gem_object *obj,
> > >  			  const struct drm_i915_gem_object_ops *ops)
> > >  {
> > > -	INIT_LIST_HEAD(&obj->mm_list);
> > >  	INIT_LIST_HEAD(&obj->global_list);
> > >  	INIT_LIST_HEAD(&obj->ring_list);
> > >  	INIT_LIST_HEAD(&obj->exec_list);
> > > @@ -4054,6 +4055,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> > >  		return ERR_PTR(-ENOMEM);
> > >  
> > >  	INIT_LIST_HEAD(&vma->vma_link);
> > > +	INIT_LIST_HEAD(&vma->mm_list);
> > >  	vma->vm = vm;
> > >  	vma->obj = obj;
> > >  
> > > @@ -4063,6 +4065,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> > >  void i915_gem_vma_destroy(struct i915_vma *vma)
> > >  {
> > >  	list_del_init(&vma->vma_link);
> > > +	list_del(&vma->mm_list);
> > >  	drm_mm_remove_node(&vma->node);
> > >  	kfree(vma);
> > >  }
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > index d1cb28c..88b0f52 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > @@ -436,7 +436,10 @@ static int do_switch(struct i915_hw_context *to)
> > >  	 * MI_SET_CONTEXT instead of when the next seqno has completed.
> > >  	 */
> > >  	if (from != NULL) {
> > > +		struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
> > > +		struct i915_address_space *ggtt = &dev_priv->gtt.base;
> > >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > > +		list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
> > 
> > I don't really see a reason to add this here ... shouldn't move_to_active
> > take care of this? Obviously not in this patch here but later on when it's
> > converted over.
> 
> Yes. You're right - it's sort of an ugly intermediate artifact.
> 
> > 
> > >  		i915_gem_object_move_to_active(from->obj, ring);
> > >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> > >  		 * whole damn pipeline, we don't need to explicitly mark the
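
To make the "move_to_active should take care of this" point concrete: later in the series (patch 26/29, "Convert active API to VMA") the LRU bump can live in a VMA-based helper, roughly along the lines below, so callers like do_switch stop open-coding the list_move_tail. This is only a sketch with names taken from this series, not the merged code.

static void i915_vma_move_to_active(struct i915_vma *vma,
				    struct intel_ring_buffer *ring)
{
	/* Object-level bookkeeping: active flag, ring_list, seqnos. */
	i915_gem_object_move_to_active(vma->obj, ring);

	/* Per-VM LRU: bump to the tail of this address space's active list. */
	list_move_tail(&vma->mm_list, &vma->vm->active_list);
}

With a helper like that, the explicit LRU frobbing in do_switch and in execbuffer_move_to_active becomes unnecessary.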
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > index 61bf5e2..425939b 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > @@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> > >  
> > >  	/* First see if there is a large enough contiguous idle region... */
> > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > > -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > +	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
> > >  		if (mark_free(vma, &unwind_list))
> > >  			goto found;
> > >  	}
> > > @@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > >  		goto none;
> > >  
> > >  	/* Now merge in the soon-to-be-expired objects... */
> > > -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > > -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > +	list_for_each_entry(vma, &vm->active_list, mm_list) {
> > >  		if (mark_free(vma, &unwind_list))
> > >  			goto found;
> > >  	}
> > > @@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
> > >  {
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > >  	struct i915_address_space *vm;
> > > -	struct drm_i915_gem_object *obj, *next;
> > > +	struct i915_vma *vma, *next;
> > >  	bool lists_empty = true;
> > >  	int ret;
> > >  
> > > @@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
> > >  
> > >  	/* Having flushed everything, unbind() should never raise an error */
> > >  	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > -		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > -			if (obj->pin_count == 0)
> > > -				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
> > > +		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> > > +			if (vma->obj->pin_count == 0)
> > > +				WARN_ON(i915_vma_unbind(vma));
> > >  	}
> > >  
> > >  	return 0;
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > index 64dc6b5..0f21702 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > @@ -801,6 +801,8 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > >  		obj->base.read_domains = obj->base.pending_read_domains;
> > >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> > >  
> > > +		/* FIXME: This lookup gets fixed later <-- danvet */
> > > +		list_move_tail(&i915_gem_obj_to_vma(obj, vm)->mm_list, &vm->active_list);
> > 
> > Ah, I guess the same comment applies to the lru frobbing in do_switch?
> > 
> > >  		i915_gem_object_move_to_active(obj, ring);
> > >  		if (obj->base.write_domain) {
> > >  			obj->dirty = 1;
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > index 000ffbd..fa60103 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > >  	obj->has_global_gtt_mapping = 1;
> > >  
> > >  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > > -	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
> > > +	list_add_tail(&vma->mm_list, &ggtt->inactive_list);
> > >  
> > >  	return obj;
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > index d970d84..9623a4e 100644
> > > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > @@ -556,11 +556,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
> > >  static u32 capture_active_bo(struct drm_i915_error_buffer *err,
> > >  			     int count, struct list_head *head)
> > >  {
> > > -	struct drm_i915_gem_object *obj;
> > > +	struct i915_vma *vma;
> > >  	int i = 0;
> > >  
> > > -	list_for_each_entry(obj, head, mm_list) {
> > > -		capture_bo(err++, obj);
> > > +	list_for_each_entry(vma, head, mm_list) {
> > > +		capture_bo(err++, vma->obj);
> > >  		if (++i == count)
> > >  			break;
> > >  	}
> > > @@ -622,7 +622,8 @@ static struct drm_i915_error_object *
> > >  i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> > >  			     struct intel_ring_buffer *ring)
> > >  {
> > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_address_space *vm;
> > > +	struct i915_vma *vma;
> > >  	struct drm_i915_gem_object *obj;
> > >  	u32 seqno;
> > >  
> > > @@ -642,20 +643,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> > >  	}
> > >  
> > >  	seqno = ring->get_seqno(ring, false);
> > > -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > > -		if (obj->ring != ring)
> > > -			continue;
> > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > +		list_for_each_entry(vma, &vm->active_list, mm_list) {
> > 
> > We could instead loop over the bound list and check for ->active. But this
> > is ok too, albeit a bit convoluted imo.
> > 
> > > +			obj = vma->obj;
> > > +			if (obj->ring != ring)
> > > +				continue;
> > >  
> > > -		if (i915_seqno_passed(seqno, obj->last_read_seqno))
> > > -			continue;
> > > +			if (i915_seqno_passed(seqno, obj->last_read_seqno))
> > > +				continue;
> > >  
> > > -		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> > > -			continue;
> > > +			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> > > +				continue;
> > >  
> > > -		/* We need to copy these to an anonymous buffer as the simplest
> > > -		 * method to avoid being overwritten by userspace.
> > > -		 */
> > > -		return i915_error_object_create(dev_priv, obj);
> > > +			/* We need to copy these to an anonymous buffer as the simplest
> > > +			 * method to avoid being overwritten by userspace.
> > > +			 */
> > > +			return i915_error_object_create(dev_priv, obj);
> > > +		}
> > >  	}
> > >  
> > >  	return NULL;
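
A rough sketch of that bound-list alternative, reusing only names already present in this patch (illustrative only; it is not what the series actually does):

	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
		if (!obj->active || obj->ring != ring)
			continue;

		if (i915_seqno_passed(seqno, obj->last_read_seqno))
			continue;

		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
			continue;

		/* Copy to an anonymous buffer so userspace can't overwrite it. */
		return i915_error_object_create(dev_priv, obj);
	}
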
> > > @@ -775,11 +779,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
> > >  				     struct drm_i915_error_state *error)
> > >  {
> > >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > +	struct i915_vma *vma;
> > >  	struct drm_i915_gem_object *obj;
> > >  	int i;
> > >  
> > >  	i = 0;
> > > -	list_for_each_entry(obj, &vm->active_list, mm_list)
> > > +	list_for_each_entry(vma, &vm->active_list, mm_list)
> > >  		i++;
> > >  	error->active_bo_count = i;
> > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> -- 
> Ben Widawsky, Intel Open Source Technology Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 24/29] drm/i915: create vmas at execbuf
  2013-08-01  0:00 ` [PATCH 24/29] drm/i915: create vmas at execbuf Ben Widawsky
@ 2013-08-07 20:52   ` Daniel Vetter
  0 siblings, 0 replies; 70+ messages in thread
From: Daniel Vetter @ 2013-08-07 20:52 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Wed, Jul 31, 2013 at 05:00:17PM -0700, Ben Widawsky wrote:
> In order to transition more of our code over to using a VMA instead of
> an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up
> until now, we've only had a VMA when actually binding an object.
> 
> The previous patch helped handle the distinction on bound vs. unbound.
> This patch will help us catch leaks, and other issues before we actually
> shuffle a bunch of stuff around.
> 
> The subsequent patch to fix up the rest of execbuf should be mostly just
> moving code around, and this is the major functional change.
> 
> v2: Release table_lock earlier so vma allocation needn't be atomic.
> (Chris)
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Merged up to this patch to dinq, thanks.
-Daniel
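
To spell out the v2 note: the handle-to-object lookup runs under file->table_lock, which is a spinlock, while creating a VMA allocates memory and may sleep, so the allocation has to happen after the lock is dropped. A minimal sketch of that shape (the real code is in eb_lookup_objects() below):

	spin_lock(&file->table_lock);
	/* resolve handles to objects and take references; no sleeping allowed */
	spin_unlock(&file->table_lock);

	/* safe to allocate now: creating a VMA may sleep */
	vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
	if (IS_ERR(vma))
		return PTR_ERR(vma);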

> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  3 +++
>  drivers/gpu/drm/i915/i915_gem.c            | 25 ++++++++++++++++++-------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 18 +++++++++++++-----
>  3 files changed, 34 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f6c2812..c0eb7fd 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1857,6 +1857,9 @@ unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
>  				struct i915_address_space *vm);
>  struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
>  				     struct i915_address_space *vm);
> +struct i915_vma *
> +i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
> +				  struct i915_address_space *vm);
>  /* Some GGTT VM helpers */
>  #define obj_to_ggtt(obj) \
>  	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 21331d8..72bd53c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3101,8 +3101,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  	struct i915_vma *vma;
>  	int ret;
>  
> -	if (WARN_ON(!list_empty(&obj->vma_list)))
> -		return -EBUSY;
> +	BUG_ON(!i915_is_ggtt(vm));
>  
>  	fence_size = i915_gem_get_gtt_size(dev,
>  					   obj->base.size,
> @@ -3142,16 +3141,15 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  
>  	i915_gem_object_pin_pages(obj);
>  
> -	/* FIXME: For now we only ever use 1 VMA per object */
> -	BUG_ON(!i915_is_ggtt(vm));
> -	WARN_ON(!list_empty(&obj->vma_list));
> -
> -	vma = i915_gem_vma_create(obj, vm);
> +	vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
>  	if (IS_ERR(vma)) {
>  		i915_gem_object_unpin_pages(obj);
>  		return PTR_ERR(vma);
>  	}
>  
> +	/* For now we only ever use 1 vma per object */
> +	WARN_ON(!list_is_singular(&obj->vma_list));
> +
>  search_free:
>  	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
>  						  size, alignment,
> @@ -4800,3 +4798,16 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
>  
>  	return NULL;
>  }
> +
> +struct i915_vma *
> +i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
> +				  struct i915_address_space *vm)
> +{
> +	struct i915_vma *vma;
> +
> +	vma = i915_gem_obj_to_vma(obj, vm);
> +	if (!vma)
> +		vma = i915_gem_vma_create(obj, vm);
> +
> +	return vma;
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 0f21702..3f17a55 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -85,14 +85,14 @@ static int
>  eb_lookup_objects(struct eb_objects *eb,
>  		  struct drm_i915_gem_exec_object2 *exec,
>  		  const struct drm_i915_gem_execbuffer2 *args,
> +		  struct i915_address_space *vm,
>  		  struct drm_file *file)
>  {
> +	struct drm_i915_gem_object *obj;
>  	int i;
>  
>  	spin_lock(&file->table_lock);
>  	for (i = 0; i < args->buffer_count; i++) {
> -		struct drm_i915_gem_object *obj;
> -
>  		obj = to_intel_bo(idr_find(&file->object_idr, exec[i].handle));
>  		if (obj == NULL) {
>  			spin_unlock(&file->table_lock);
> @@ -110,6 +110,15 @@ eb_lookup_objects(struct eb_objects *eb,
>  
>  		drm_gem_object_reference(&obj->base);
>  		list_add_tail(&obj->exec_list, &eb->objects);
> +	}
> +	spin_unlock(&file->table_lock);
> +
> +	list_for_each_entry(obj,  &eb->objects, exec_list) {
> +		struct i915_vma *vma;
> +
> +		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
> +		if (IS_ERR(vma))
> +			return PTR_ERR(vma);
>  
>  		obj->exec_entry = &exec[i];
>  		if (eb->and < 0) {
> @@ -121,7 +130,6 @@ eb_lookup_objects(struct eb_objects *eb,
>  				       &eb->buckets[handle & eb->and]);
>  		}
>  	}
> -	spin_unlock(&file->table_lock);
>  
>  	return 0;
>  }
> @@ -672,7 +680,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>  
>  	/* reacquire the objects */
>  	eb_reset(eb);
> -	ret = eb_lookup_objects(eb, exec, args, file);
> +	ret = eb_lookup_objects(eb, exec, args, vm, file);
>  	if (ret)
>  		goto err;
>  
> @@ -1009,7 +1017,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	}
>  
>  	/* Look up object handles */
> -	ret = eb_lookup_objects(eb, exec, args, file);
> +	ret = eb_lookup_objects(eb, exec, args, vm, file);
>  	if (ret)
>  		goto err;
>  
> -- 
> 1.8.3.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 21/29] drm/i915: mm_list is per VMA
  2013-08-07 20:52       ` Daniel Vetter
@ 2013-08-08  4:32         ` Ben Widawsky
  2013-08-08  6:46           ` Daniel Vetter
  0 siblings, 1 reply; 70+ messages in thread
From: Ben Widawsky @ 2013-08-08  4:32 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Wed, Aug 07, 2013 at 10:52:14PM +0200, Daniel Vetter wrote:
> On Tue, Aug 06, 2013 at 05:28:06PM -0700, Ben Widawsky wrote:
> > On Tue, Aug 06, 2013 at 09:38:41PM +0200, Daniel Vetter wrote:
> > > On Wed, Jul 31, 2013 at 05:00:14PM -0700, Ben Widawsky wrote:
> > > > formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
> > > > 
> > > > The mm_list is used for the active/inactive LRUs. Since those LRUs are
> > > > per address space, the link should be per VMA.
> > > > 
> > > > Because we'll only ever have 1 VMA before this point, it's not incorrect
> > > > to defer this change until this point in the patch series, and doing it
> > > > here makes the change much easier to understand.
> > > > 
> > > > Shamelessly manipulated out of Daniel:
> > > > "active/inactive stuff is used by eviction when we run out of address
> > > > space, so needs to be per-vma and per-address space. Bound/unbound otoh
> > > > is used by the shrinker which only cares about the amount of memory used
> > > > and not one bit about in which address space this memory is all used in.
> > > > Of course to actual kick out an object we need to unbind it from every
> > > > address space, but for that we have the per-object list of vmas."
> > > > 
> > > > v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
> > > > 
> > > > v3: Moved earlier in the series
> > > > 
> > > > v4: Add dropped message from v3
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
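
For readers following along, a rough summary of the list topology the commit message above describes, using field names from this series (abbreviated, not the real struct definitions):

struct i915_address_space {
	struct list_head active_list;    /* per-VM LRU, links vma->mm_list */
	struct list_head inactive_list;  /* per-VM LRU, links vma->mm_list */
	/* ... */
};

struct i915_vma {
	struct list_head mm_list;        /* on exactly one VM's (in)active list */
	struct list_head vma_link;       /* on the object's vma_list */
	/* ... */
};

struct drm_i915_gem_object {
	struct list_head global_list;    /* bound/unbound lists for the shrinker */
	struct list_head vma_list;       /* all VMAs of this object, to unbind everywhere */
	/* ... */
};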
> > > 
> > > Some comments below for this one. The lru changes look a bit strange so
> > > I'll wait for your confirmation that the do_switch hunk has the same
> > > reasons as the one in execbuf with the FIXME comment.
> > > -Daniel
> > > 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_debugfs.c        | 53 ++++++++++++++++++++----------
> > > >  drivers/gpu/drm/i915/i915_drv.h            |  5 +--
> > > >  drivers/gpu/drm/i915/i915_gem.c            | 37 +++++++++++----------
> > > >  drivers/gpu/drm/i915/i915_gem_context.c    |  3 ++
> > > >  drivers/gpu/drm/i915/i915_gem_evict.c      | 14 ++++----
> > > >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 ++
> > > >  drivers/gpu/drm/i915/i915_gem_stolen.c     |  2 +-
> > > >  drivers/gpu/drm/i915/i915_gpu_error.c      | 37 ++++++++++++---------
> > > >  8 files changed, 91 insertions(+), 62 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > index 6d5ca85bd..181e5a6 100644
> > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > @@ -149,7 +149,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > > >  	struct drm_device *dev = node->minor->dev;
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > -	struct drm_i915_gem_object *obj;
> > > > +	struct i915_vma *vma;
> > > >  	size_t total_obj_size, total_gtt_size;
> > > >  	int count, ret;
> > > >  
> > > > @@ -157,6 +157,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > > >  	if (ret)
> > > >  		return ret;
> > > >  
> > > > +	/* FIXME: the user of this interface might want more than just GGTT */
> > > >  	switch (list) {
> > > >  	case ACTIVE_LIST:
> > > >  		seq_puts(m, "Active:\n");
> > > > @@ -172,12 +173,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > > >  	}
> > > >  
> > > >  	total_obj_size = total_gtt_size = count = 0;
> > > > -	list_for_each_entry(obj, head, mm_list) {
> > > > -		seq_puts(m, "   ");
> > > > -		describe_obj(m, obj);
> > > > -		seq_putc(m, '\n');
> > > > -		total_obj_size += obj->base.size;
> > > > -		total_gtt_size += i915_gem_obj_ggtt_size(obj);
> > > > +	list_for_each_entry(vma, head, mm_list) {
> > > > +		seq_printf(m, "   ");
> > > > +		describe_obj(m, vma->obj);
> > > > +		seq_printf(m, "\n");
> > > > +		total_obj_size += vma->obj->base.size;
> > > > +		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
> > > 
> > > Why not use vma->node.size? If you don't disagree I'll bikeshed this while
> > > applying.
> > > 
> > 
> > I think in terms of the diff, it's more logical to do it how I did. The
> > result should damn well be the same though, so go right ahead. When I
> > set about writing the series, I really wanted to avoid using
> > node.size/start directly as much as possible - so we can sneak things
> > into the helpers as needed.
> 
> I've applied this bikeshed, but the patch required some wiggling in and
> conflict resolution. I've checked with your branch and that seems to be
> due to the removal of the inactive list walking to adjust the gpu domains
> in i915_gem_reset. Please check that I didn't botch the patch rebasing
> with your tree.
> -Daniel
> 

You killed a BUG in i915_gem_retire_requests_ring, shouldn't that be a WARN or are you in the business of completely killing assertions now :p?

Otherwise, it looks good to me. There are enough diffs because of some
other patches you merged (like watermarks) that I may well have missed
something in the noise; i.e. no promises.

> > 
> > 
> > > >  		count++;
> > > >  	}
> > > >  	mutex_unlock(&dev->struct_mutex);
> > > > @@ -224,7 +225,18 @@ static int per_file_stats(int id, void *ptr, void *data)
> > > >  	return 0;
> > > >  }
> > > >  
> > > > -static int i915_gem_object_info(struct seq_file *m, void *data)
> > > > +#define count_vmas(list, member) do { \
> > > > +	list_for_each_entry(vma, list, member) { \
> > > > +		size += i915_gem_obj_ggtt_size(vma->obj); \
> > > > +		++count; \
> > > > +		if (vma->obj->map_and_fenceable) { \
> > > > +			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
> > > > +			++mappable_count; \
> > > > +		} \
> > > > +	} \
> > > > +} while (0)
> > > > +
> > > > +static int i915_gem_object_info(struct seq_file *m, void* data)
> > > >  {
> > > >  	struct drm_info_node *node = (struct drm_info_node *) m->private;
> > > >  	struct drm_device *dev = node->minor->dev;
> > > > @@ -234,6 +246,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > >  	struct drm_file *file;
> > > > +	struct i915_vma *vma;
> > > >  	int ret;
> > > >  
> > > >  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> > > > @@ -253,12 +266,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
> > > >  		   count, mappable_count, size, mappable_size);
> > > >  
> > > >  	size = count = mappable_size = mappable_count = 0;
> > > > -	count_objects(&vm->active_list, mm_list);
> > > > +	count_vmas(&vm->active_list, mm_list);
> > > >  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
> > > >  		   count, mappable_count, size, mappable_size);
> > > >  
> > > >  	size = count = mappable_size = mappable_count = 0;
> > > > -	count_objects(&vm->inactive_list, mm_list);
> > > > +	count_vmas(&vm->inactive_list, mm_list);
> > > >  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
> > > >  		   count, mappable_count, size, mappable_size);
> > > >  
> > > > @@ -1771,7 +1784,8 @@ i915_drop_caches_set(void *data, u64 val)
> > > >  	struct drm_device *dev = data;
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > >  	struct drm_i915_gem_object *obj, *next;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct i915_address_space *vm;
> > > > +	struct i915_vma *vma, *x;
> > > >  	int ret;
> > > >  
> > > >  	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
> > > > @@ -1792,13 +1806,16 @@ i915_drop_caches_set(void *data, u64 val)
> > > >  		i915_gem_retire_requests(dev);
> > > >  
> > > >  	if (val & DROP_BOUND) {
> > > > -		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> > > > -					 mm_list) {
> > > > -			if (obj->pin_count)
> > > > -				continue;
> > > > -			ret = i915_gem_object_ggtt_unbind(obj);
> > > > -			if (ret)
> > > > -				goto unlock;
> > > > +		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +			list_for_each_entry_safe(vma, x, &vm->inactive_list,
> > > > +						 mm_list) {
> > > 
> > > Imo the double-loop is a bit funny, looping over the global bound list
> > > and skipping all active objects is imo the more straightfoward logic. But
> > > I agree that this is the more straightforward conversion, so I'm ok with a
> > > follow-up fixup patch.
> > > 
> > 
> > I guess we have a lot of such conversions. I don't really mind the
> > change, just a bit worried that it's less tested than what I've already
> > done. I'm also not yet convinced the result will be a huge improvement
> > for readability, but I've been staring at these lists for so long, my
> > opinion is quite biased.
> > 
> > I guess we'll have to see. I've made a note to myself to look into
> > converting all these types of loops over, but as we should see little to
> > no functional impact from the change, I'd like to hold off until we get
> > the rest merged.
> > 
> > > > +				if (vma->obj->pin_count)
> > > > +					continue;
> > > > +
> > > > +				ret = i915_vma_unbind(vma);
> > > > +				if (ret)
> > > > +					goto unlock;
> > > > +			}
> > > >  		}
> > > >  	}
> > > >  
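
For comparison, a rough sketch of the single-loop shape suggested above, walking the global bound list and skipping active or pinned objects. It only illustrates the suggestion and reuses names from this series; it is not what the patch does:

	if (val & DROP_BOUND) {
		list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
					 global_list) {
			struct i915_vma *vma, *v;

			if (obj->active || obj->pin_count)
				continue;

			list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) {
				ret = i915_vma_unbind(vma);
				if (ret)
					goto unlock;
			}
		}
	}
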
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index bf1ecef..220699b 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -546,6 +546,9 @@ struct i915_vma {
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	struct i915_address_space *vm;
> > > >  
> > > > +	/** This object's place on the active/inactive lists */
> > > > +	struct list_head mm_list;
> > > > +
> > > >  	struct list_head vma_link; /* Link in the object's VMA list */
> > > >  };
> > > >  
> > > > @@ -1263,9 +1266,7 @@ struct drm_i915_gem_object {
> > > >  	struct drm_mm_node *stolen;
> > > >  	struct list_head global_list;
> > > >  
> > > > -	/** This object's place on the active/inactive lists */
> > > >  	struct list_head ring_list;
> > > > -	struct list_head mm_list;
> > > >  	/** This object's place in the batchbuffer or on the eviction list */
> > > >  	struct list_head exec_list;
> > > >  
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index ec23a5c..fb3f02f 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -1872,7 +1872,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > >  {
> > > >  	struct drm_device *dev = obj->base.dev;
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > >  	u32 seqno = intel_ring_get_seqno(ring);
> > > >  
> > > >  	BUG_ON(ring == NULL);
> > > > @@ -1888,8 +1887,6 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > >  		obj->active = 1;
> > > >  	}
> > > >  
> > > > -	/* Move from whatever list we were on to the tail of execution. */
> > > > -	list_move_tail(&obj->mm_list, &vm->active_list);
> > > >  	list_move_tail(&obj->ring_list, &ring->active_list);
> > > >  
> > > >  	obj->last_read_seqno = seqno;
> > > > @@ -1911,14 +1908,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > >  static void
> > > >  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> > > >  {
> > > > -	struct drm_device *dev = obj->base.dev;
> > > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > > +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
> > > > +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> > > >  
> > > >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> > > >  	BUG_ON(!obj->active);
> > > >  
> > > > -	list_move_tail(&obj->mm_list, &vm->inactive_list);
> > > > +	list_move_tail(&vma->mm_list, &ggtt_vm->inactive_list);
> > > >  
> > > >  	list_del_init(&obj->ring_list);
> > > >  	obj->ring = NULL;
> > > > @@ -2286,9 +2283,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
> > > >  void i915_gem_reset(struct drm_device *dev)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	struct i915_address_space *vm;
> > > > -	struct drm_i915_gem_object *obj;
> > > >  	struct intel_ring_buffer *ring;
> > > > +	struct i915_address_space *vm;
> > > > +	struct i915_vma *vma;
> > > >  	int i;
> > > >  
> > > >  	for_each_ring(ring, dev_priv, i)
> > > > @@ -2298,8 +2295,8 @@ void i915_gem_reset(struct drm_device *dev)
> > > >  	 * necessary invalidation upon reuse.
> > > >  	 */
> > > >  	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > -		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > > -			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > > > +		list_for_each_entry(vma, &vm->inactive_list, mm_list)
> > > > +			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > > >  
> > > >  	i915_gem_restore_fences(dev);
> > > >  }
> > > > @@ -2353,6 +2350,7 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> > > >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> > > >  			break;
> > > >  
> > > > +		BUG_ON(!obj->active);
> > > >  		i915_gem_object_move_to_inactive(obj);
> > > >  	}
> > > >  
> > > > @@ -2635,7 +2633,6 @@ int i915_vma_unbind(struct i915_vma *vma)
> > > >  	i915_gem_gtt_finish_object(obj);
> > > >  	i915_gem_object_unpin_pages(obj);
> > > >  
> > > > -	list_del(&obj->mm_list);
> > > >  	/* Avoid an unnecessary call to unbind on rebind. */
> > > >  	if (i915_is_ggtt(vma->vm))
> > > >  		obj->map_and_fenceable = true;
> > > > @@ -3180,7 +3177,7 @@ search_free:
> > > >  		goto err_out;
> > > >  
> > > >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > > > -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > > > +	list_add_tail(&vma->mm_list, &vm->inactive_list);
> > > >  
> > > >  	/* Keep GGTT vmas first to make debug easier */
> > > >  	if (i915_is_ggtt(vm))
> > > > @@ -3342,9 +3339,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> > > >  					    old_write_domain);
> > > >  
> > > >  	/* And bump the LRU for this access */
> > > > -	if (i915_gem_object_is_inactive(obj))
> > > > -		list_move_tail(&obj->mm_list,
> > > > -			       &dev_priv->gtt.base.inactive_list);
> > > > +	if (i915_gem_object_is_inactive(obj)) {
> > > > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> > > > +							   &dev_priv->gtt.base);
> > > > +		if (vma)
> > > > +			list_move_tail(&vma->mm_list,
> > > > +				       &dev_priv->gtt.base.inactive_list);
> > > > +
> > > > +	}
> > > >  
> > > >  	return 0;
> > > >  }
> > > > @@ -3917,7 +3919,6 @@ unlock:
> > > >  void i915_gem_object_init(struct drm_i915_gem_object *obj,
> > > >  			  const struct drm_i915_gem_object_ops *ops)
> > > >  {
> > > > -	INIT_LIST_HEAD(&obj->mm_list);
> > > >  	INIT_LIST_HEAD(&obj->global_list);
> > > >  	INIT_LIST_HEAD(&obj->ring_list);
> > > >  	INIT_LIST_HEAD(&obj->exec_list);
> > > > @@ -4054,6 +4055,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> > > >  		return ERR_PTR(-ENOMEM);
> > > >  
> > > >  	INIT_LIST_HEAD(&vma->vma_link);
> > > > +	INIT_LIST_HEAD(&vma->mm_list);
> > > >  	vma->vm = vm;
> > > >  	vma->obj = obj;
> > > >  
> > > > @@ -4063,6 +4065,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> > > >  void i915_gem_vma_destroy(struct i915_vma *vma)
> > > >  {
> > > >  	list_del_init(&vma->vma_link);
> > > > +	list_del(&vma->mm_list);
> > > >  	drm_mm_remove_node(&vma->node);
> > > >  	kfree(vma);
> > > >  }
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > > index d1cb28c..88b0f52 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > > @@ -436,7 +436,10 @@ static int do_switch(struct i915_hw_context *to)
> > > >  	 * MI_SET_CONTEXT instead of when the next seqno has completed.
> > > >  	 */
> > > >  	if (from != NULL) {
> > > > +		struct drm_i915_private *dev_priv = from->obj->base.dev->dev_private;
> > > > +		struct i915_address_space *ggtt = &dev_priv->gtt.base;
> > > >  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > > > +		list_move_tail(&i915_gem_obj_to_vma(from->obj, ggtt)->mm_list, &ggtt->active_list);
> > > 
> > > I don't really see a reason to add this here ... shouldn't move_to_active
> > > take care of this? Obviously not in this patch here but later on when it's
> > > converted over.
> > 
> > Yes. You're right - it's sort of an ugly intermediate artifact.
> > 
> > > 
> > > >  		i915_gem_object_move_to_active(from->obj, ring);
> > > >  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> > > >  		 * whole damn pipeline, we don't need to explicitly mark the
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > index 61bf5e2..425939b 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > @@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > > >  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
> > > >  
> > > >  	/* First see if there is a large enough contiguous idle region... */
> > > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> > > > -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > > +	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
> > > >  		if (mark_free(vma, &unwind_list))
> > > >  			goto found;
> > > >  	}
> > > > @@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> > > >  		goto none;
> > > >  
> > > >  	/* Now merge in the soon-to-be-expired objects... */
> > > > -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > > > -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > > > +	list_for_each_entry(vma, &vm->active_list, mm_list) {
> > > >  		if (mark_free(vma, &unwind_list))
> > > >  			goto found;
> > > >  	}
> > > > @@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
> > > >  {
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > >  	struct i915_address_space *vm;
> > > > -	struct drm_i915_gem_object *obj, *next;
> > > > +	struct i915_vma *vma, *next;
> > > >  	bool lists_empty = true;
> > > >  	int ret;
> > > >  
> > > > @@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
> > > >  
> > > >  	/* Having flushed everything, unbind() should never raise an error */
> > > >  	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > -		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> > > > -			if (obj->pin_count == 0)
> > > > -				WARN_ON(i915_vma_unbind(i915_gem_obj_to_vma(obj, vm)));
> > > > +		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> > > > +			if (vma->obj->pin_count == 0)
> > > > +				WARN_ON(i915_vma_unbind(vma));
> > > >  	}
> > > >  
> > > >  	return 0;
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > index 64dc6b5..0f21702 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > > @@ -801,6 +801,8 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
> > > >  		obj->base.read_domains = obj->base.pending_read_domains;
> > > >  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
> > > >  
> > > > +		/* FIXME: This lookup gets fixed later <-- danvet */
> > > > +		list_move_tail(&i915_gem_obj_to_vma(obj, vm)->mm_list, &vm->active_list);
> > > 
> > > Ah, I guess the same comment applies to the lru frobbing in do_switch?
> > > 
> > > >  		i915_gem_object_move_to_active(obj, ring);
> > > >  		if (obj->base.write_domain) {
> > > >  			obj->dirty = 1;
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > index 000ffbd..fa60103 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > >  	obj->has_global_gtt_mapping = 1;
> > > >  
> > > >  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > > > -	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
> > > > +	list_add_tail(&vma->mm_list, &ggtt->inactive_list);
> > > >  
> > > >  	return obj;
> > > >  
> > > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > > index d970d84..9623a4e 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > > @@ -556,11 +556,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
> > > >  static u32 capture_active_bo(struct drm_i915_error_buffer *err,
> > > >  			     int count, struct list_head *head)
> > > >  {
> > > > -	struct drm_i915_gem_object *obj;
> > > > +	struct i915_vma *vma;
> > > >  	int i = 0;
> > > >  
> > > > -	list_for_each_entry(obj, head, mm_list) {
> > > > -		capture_bo(err++, obj);
> > > > +	list_for_each_entry(vma, head, mm_list) {
> > > > +		capture_bo(err++, vma->obj);
> > > >  		if (++i == count)
> > > >  			break;
> > > >  	}
> > > > @@ -622,7 +622,8 @@ static struct drm_i915_error_object *
> > > >  i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> > > >  			     struct intel_ring_buffer *ring)
> > > >  {
> > > > -	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct i915_address_space *vm;
> > > > +	struct i915_vma *vma;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	u32 seqno;
> > > >  
> > > > @@ -642,20 +643,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> > > >  	}
> > > >  
> > > >  	seqno = ring->get_seqno(ring, false);
> > > > -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> > > > -		if (obj->ring != ring)
> > > > -			continue;
> > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > +		list_for_each_entry(vma, &vm->active_list, mm_list) {
> > > 
> > > We could instead loop over the bound list and check for ->active. But this
> > > is ok too, albeit a bit convoluted imo.
> > > 
> > > > +			obj = vma->obj;
> > > > +			if (obj->ring != ring)
> > > > +				continue;
> > > >  
> > > > -		if (i915_seqno_passed(seqno, obj->last_read_seqno))
> > > > -			continue;
> > > > +			if (i915_seqno_passed(seqno, obj->last_read_seqno))
> > > > +				continue;
> > > >  
> > > > -		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> > > > -			continue;
> > > > +			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> > > > +				continue;
> > > >  
> > > > -		/* We need to copy these to an anonymous buffer as the simplest
> > > > -		 * method to avoid being overwritten by userspace.
> > > > -		 */
> > > > -		return i915_error_object_create(dev_priv, obj);
> > > > +			/* We need to copy these to an anonymous buffer as the simplest
> > > > +			 * method to avoid being overwritten by userspace.
> > > > +			 */
> > > > +			return i915_error_object_create(dev_priv, obj);
> > > > +		}
> > > >  	}
> > > >  
> > > >  	return NULL;
> > > > @@ -775,11 +779,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
> > > >  				     struct drm_i915_error_state *error)
> > > >  {
> > > >  	struct i915_address_space *vm = &dev_priv->gtt.base;
> > > > +	struct i915_vma *vma;
> > > >  	struct drm_i915_gem_object *obj;
> > > >  	int i;
> > > >  
> > > >  	i = 0;
> > > > -	list_for_each_entry(obj, &vm->active_list, mm_list)
> > > > +	list_for_each_entry(vma, &vm->active_list, mm_list)
> > > >  		i++;
> > > >  	error->active_bo_count = i;
> > > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> > -- 
> > Ben Widawsky, Intel Open Source Technology Center
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 21/29] drm/i915: mm_list is per VMA
  2013-08-08  4:32         ` Ben Widawsky
@ 2013-08-08  6:46           ` Daniel Vetter
  2013-08-08 18:10             ` Ben Widawsky
  0 siblings, 1 reply; 70+ messages in thread
From: Daniel Vetter @ 2013-08-08  6:46 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Aug 8, 2013 at 6:32 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> You killed a BUG in i915_gem_retire_requests_ring, shouldn't that be a WARN or are you in the business of completely killing assertions now :p?

Yeah, and my little commit message annotation even explained that it's
fully redundant since the move_to_inactive function called on the next
line has the exact same check ;-)

> Otherwise, it looks good to me. There are enough diffs because of some
> other patches you merged (like watermarks) that I may well have missed
> something in the noise; i.e. no promises.

Thanks, though stupid me failed to push out the last patch I've
merged. But that one applied without fuzz, so I think it should be ok.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 21/29] drm/i915: mm_list is per VMA
  2013-08-08  6:46           ` Daniel Vetter
@ 2013-08-08 18:10             ` Ben Widawsky
  0 siblings, 0 replies; 70+ messages in thread
From: Ben Widawsky @ 2013-08-08 18:10 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Thu, Aug 08, 2013 at 08:46:46AM +0200, Daniel Vetter wrote:
> On Thu, Aug 8, 2013 at 6:32 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > You killed a BUG in i915_gem_retire_requests_ring, shouldn't that be a WARN or are you in the business of completely killing assertions now :p?
> 
> Yeah, and my little commit message annotation even explained that it's
> fully redundant since the move_to_inactive function called on the next
> line has the exact same check ;-)

Heh, I realized this as I was dozing off last night... "didn't danvet
point that out already." Oh well, at least I passed the test.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2013-08-08 18:10 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-31 23:59 [PATCH 00/29] Completion of i915 VMAs v2 Ben Widawsky
2013-07-31 23:59 ` [PATCH 01/29] drm/i915: Create an init vm Ben Widawsky
2013-07-31 23:59 ` [PATCH 02/29] drm/i915: Rework drop caches for checkpatch Ben Widawsky
2013-08-03 11:32   ` Chris Wilson
2013-08-03 22:10     ` Ben Widawsky
2013-07-31 23:59 ` [PATCH 03/29] drm/i915: Make proper functions for VMs Ben Widawsky
2013-07-31 23:59 ` [PATCH 04/29] drm/i915: Use bound list for inactive shrink Ben Widawsky
2013-07-31 23:59 ` [PATCH 05/29] drm/i915: Add VM to pin Ben Widawsky
2013-07-31 23:59 ` [PATCH 06/29] drm/i915: Use ggtt_vm to save some typing Ben Widawsky
2013-08-01  0:00 ` [PATCH 07/29] drm/i915: Update describe_obj Ben Widawsky
2013-08-01  0:00 ` [PATCH 08/29] drm/i915: Rework __i915_gem_shrink Ben Widawsky
2013-08-05  8:59   ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 09/29] drm/i915: thread address space through execbuf Ben Widawsky
2013-08-05  9:39   ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 10/29] drm/i915: make caching operate on all address spaces Ben Widawsky
2013-08-05  9:41   ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 11/29] drm/i915: BUG_ON put_pages later Ben Widawsky
2013-08-05  9:42   ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 12/29] drm/i915: make reset&hangcheck code VM aware Ben Widawsky
2013-08-01  0:00 ` [PATCH 13/29] drm/i915: clear domains for all objects on reset Ben Widawsky
2013-08-03 10:59   ` Chris Wilson
2013-08-03 22:24     ` Ben Widawsky
2013-08-05  9:52       ` Daniel Vetter
2013-08-05 16:46   ` [PATCH 13/29] drm/i915: eliminate dead domain clearing " Ben Widawsky
2013-08-05 17:13     ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 14/29] drm/i915: Restore PDEs on gtt restore Ben Widawsky
2013-08-06 18:14   ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 15/29] drm/i915: Improve VMA comments Ben Widawsky
2013-08-01  0:00 ` [PATCH 16/29] drm/i915: Cleanup more of VMA in destroy Ben Widawsky
2013-08-01  0:00 ` [PATCH 17/29] drm/i915: plumb VM into bind/unbind code Ben Widawsky
2013-08-06 18:29   ` Daniel Vetter
2013-08-06 18:54     ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 18/29] drm/i915: Use new bind/unbind in eviction code Ben Widawsky
2013-08-06 18:39   ` Daniel Vetter
2013-08-06 21:27     ` Ben Widawsky
2013-08-06 21:29       ` Daniel Vetter
2013-08-06 22:57         ` Ben Widawsky
2013-08-06 22:59           ` Daniel Vetter
2013-08-06 23:25             ` Ben Widawsky
2013-08-06 23:44               ` Daniel Vetter
2013-08-07 18:24                 ` Ben Widawsky
2013-08-01  0:00 ` [PATCH 19/29] drm/i915: turn bound_ggtt checks to bound_any Ben Widawsky
2013-08-03 11:03   ` Chris Wilson
2013-08-03 22:26     ` Ben Widawsky
2013-08-06 18:43   ` Daniel Vetter
2013-08-06 21:29     ` Ben Widawsky
2013-08-01  0:00 ` [PATCH 20/29] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
2013-08-06 19:11   ` Daniel Vetter
2013-08-07 18:37     ` Ben Widawsky
2013-08-07 20:32       ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 21/29] drm/i915: mm_list is per VMA Ben Widawsky
2013-08-06 19:38   ` Daniel Vetter
2013-08-07  0:28     ` Ben Widawsky
2013-08-07 20:52       ` Daniel Vetter
2013-08-08  4:32         ` Ben Widawsky
2013-08-08  6:46           ` Daniel Vetter
2013-08-08 18:10             ` Ben Widawsky
2013-08-01  0:00 ` [PATCH 22/29] drm/i915: Update error capture for VMs Ben Widawsky
2013-08-01  0:00 ` [PATCH 23/29] drm/i915: Add vma to list at creation Ben Widawsky
2013-08-01  0:00 ` [PATCH 24/29] drm/i915: create vmas at execbuf Ben Widawsky
2013-08-07 20:52   ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 25/29] drm/i915: Convert execbuf code to use vmas Ben Widawsky
2013-08-06 20:43   ` Daniel Vetter
2013-08-06 20:45     ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 26/29] drm/i915: Convert active API to VMA Ben Widawsky
2013-08-06 20:47   ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 27/29] drm/i915: Add bind/unbind object functions to VM Ben Widawsky
2013-08-01  0:00 ` [PATCH 28/29] drm/i915: Use the new vm [un]bind functions Ben Widawsky
2013-08-06 20:58   ` Daniel Vetter
2013-08-01  0:00 ` [PATCH 29/29] drm/i915: eliminate vm->insert_entries() Ben Widawsky
