All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/12] Completion of i915 VMAs
@ 2013-07-22  2:08 Ben Widawsky
  2013-07-22  2:08 ` [PATCH 01/12] drm/i915: plumb VM into object operations Ben Widawsky
                   ` (12 more replies)
  0 siblings, 13 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Included in this series are the unmerged patches from the VMA only
version of my post [1] as well as two new sets of patches on top. Daniel
has already merged the first part of the series which introduced the
VMAs, and VMs, but didn't actually use them ([2]).

As agreed up in the previous mail thread ([1]), I've put in a couple
things on top of the old series to make the end result what Daniel was
looking for; and I also agree it's an improvement. (One downside however
is I can no longer use the original 66 patch series [3] as a direct
reference) objective was to not throw away all the testing I had done on
the previous work, make the changes easier to review, but still get to
where we needed to get to.  The diff churn is a bit higher than if I had
just rebased it in.

The two big things this adds from the last go around are:
1. map/unmap_vma function pointers for the VM.
2. conversion of some <obj,vm> -> vma in functions.

Map, and unmap are logical functionalities to add for an address space.
They do more or less what you'd think: take an object and create a
mapping via the GPU's page tables to that object. Of course, without the
rest of the patches from [3], there will only ever be 1 address space,
with the weird aliasing ppgtt behind it. One thing which I toyed with,
but opted not to include was to directly pass obj,vm to map/unmap
instead of doing the slightly less pretty way as I've done in execbuf
and bind. In the future I think I may just do this, but for now it's not
a big win as the end result wasn't much better (and I didn't get it to
immediately work).

The conversion of <obj,vm> to VMA was a lot of churn, but the idea is
simple. In the original series [1,3], I passed the pair of an object and
an address space everywhere. Every time I needed to convert that into a
VMA, it requires walking a list. In fact, we only need to walk the list
once - GEM is still all about BOs, and I have no intention of changing
this - so we must walk the list at user space entry points; but from
thereon it can all be a VMA. The caveat is, we do have a few internal
APIs that are easily broken unless we talk in objects (caching is one of
them). As I mentioned in the original series, we may eventually want to
move things over to be all in the VMA. For now, I think this is fine as
it stands. You'll notice unbind() to be a wart currently - but the
alternative looked worse to my eyes.

Breakdown:
The first 4 patches are basically what I sent in [1], except I've
squashed some stuff, rebased, and applied some requested fixups, which
brought the patch count down from 6->4.

The next 3 patches are about the map/unmap mentioned in #1.

And the final 5 patches are for replacing <obj,vm> pairs with vma, #2.

Testing:
IGT looks pretty stable on IVB. I was having a lot of issues on gpu reset
on my ILK, but somehow after a clean build, I stopped seeing it. I've
done nothing else.

References:
[1] http://lists.freedesktop.org/archives/intel-gfx/2013-July/029974.html
[2] http://lists.freedesktop.org/archives/intel-gfx/2013-July/030395.html
[3] http://lists.freedesktop.org/archives/intel-gfx/2013-June/029408.html

I've pushed a badly rebased onto -nightly here (not worth fixing):
http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=ppgtt2

---

Ben Widawsky (12):
  drm/i915: plumb VM into object operations
  drm/i915: Fix up map and fenceable for VMA
  drm/i915: Update error capture for VMs
  drm/i915: Track active by VMA instead of object
  drm/i915: Add map/unmap object functions to VM
  drm/i915: Use the new vm [un]bind functions
  drm/i915: eliminate vm->insert_entries()
  drm/i915: Add vma to list at creation
  drm/i915: create vmas at execbuf
  drm/i915: Convert execbuf code to use vmas
  drm/i915: Convert object coloring to VMA
  drm/i915: Convert active API to VMA

 drivers/gpu/drm/i915/i915_debugfs.c        |  68 +++--
 drivers/gpu/drm/i915/i915_dma.c            |   4 -
 drivers/gpu/drm/i915/i915_drv.h            | 193 +++++++------
 drivers/gpu/drm/i915/i915_gem.c            | 433 ++++++++++++++++++++---------
 drivers/gpu/drm/i915/i915_gem_context.c    |  17 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  78 +++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 321 +++++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 140 ++++++----
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  10 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      | 111 +++++---
 drivers/gpu/drm/i915/i915_trace.h          |  20 +-
 drivers/gpu/drm/i915/intel_fb.c            |   1 -
 drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
 drivers/gpu/drm/i915/intel_pm.c            |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
 16 files changed, 880 insertions(+), 546 deletions(-)

-- 
1.8.3.3

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 01/12] drm/i915: plumb VM into object operations
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-23 16:37   ` Daniel Vetter
  2013-07-26  9:51   ` Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations) Daniel Vetter
  2013-07-22  2:08 ` [PATCH 02/12] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This patch was formerly known as:
"drm/i915: Create VMAs (part 3) - plumbing"

This patch adds a VM argument, bind/unbind, and the object
offset/size/color getters/setters. It preserves the old ggtt helper
functions because things still need, and will continue to need them.

Some code will still need to be ported over after this.

v2: Fix purge to pick an object and unbind all vmas
This was doable because of the global bound list change.

v3: With the commit to actually pin/unpin pages in place, there is no
longer a need to check if unbind succeeded before calling put_pages().
Make put_pages only BUG() after checking pin count.

v4: Rebased on top of the new hangcheck work by Mika
plumbed eb_destroy also
Many checkpatch related fixes

v5: Very large rebase

v6:
Change BUG_ON to WARN_ON (Daniel)
Rename vm to ggtt in preallocate stolen, since it is always ggtt when
dealing with stolen memory. (Daniel)
list_for_each will short-circuit already (Daniel)
remove superflous space (Daniel)
Use per object list of vmas (Daniel)
Make obj_bound_any() use obj_bound for each vm (Ben)
s/bind_to_gtt/bind_to_vm/ (Ben)

Fixed up the inactive shrinker. As Daniel noticed the code could
potentially count the same object multiple times. While it's not
possible in the current case, since 1 object can only ever be bound into
1 address space thus far - we may as well try to get something more
future proof in place now. With a prep patch before this to switch over
to using the bound list + inactive check, we're now able to carry that
forward for every address space an object is bound into.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  29 ++-
 drivers/gpu/drm/i915/i915_dma.c            |   4 -
 drivers/gpu/drm/i915/i915_drv.h            | 107 +++++----
 drivers/gpu/drm/i915/i915_gem.c            | 337 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  10 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
 drivers/gpu/drm/i915/i915_trace.h          |  20 +-
 drivers/gpu/drm/i915/intel_fb.c            |   1 -
 drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
 drivers/gpu/drm/i915/intel_pm.c            |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
 15 files changed, 479 insertions(+), 245 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index be69807..f8e590f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj)
 static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
+	struct i915_vma *vma;
 	seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s",
 		   &obj->base,
 		   get_pin_flag(obj),
@@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (pinned x %d)", obj->pin_count);
 	if (obj->fence_reg != I915_FENCE_REG_NONE)
 		seq_printf(m, " (fence: %d)", obj->fence_reg);
-	if (i915_gem_obj_ggtt_bound(obj))
-		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
-			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
+		if (!i915_is_ggtt(vma->vm))
+			seq_puts(m, " (pp");
+		else
+			seq_puts(m, " (g");
+		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
+			   i915_gem_obj_offset(obj, vma->vm),
+			   i915_gem_obj_size(obj, vma->vm));
+	}
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
 	if (obj->pin_mappable || obj->fault_mappable) {
@@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+/* FIXME: Support multiple VM? */
 #define count_objects(list, member) do { \
 	list_for_each_entry(obj, list, member) { \
 		size += i915_gem_obj_ggtt_size(obj); \
@@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val)
 
 	if (val & DROP_BOUND) {
 		list_for_each_entry_safe(obj, next, &vm->inactive_list,
-					 mm_list)
-			if (obj->pin_count == 0) {
-				ret = i915_gem_object_unbind(obj);
-				if (ret)
-					goto unlock;
-			}
+					 mm_list) {
+			if (obj->pin_count)
+				continue;
+
+			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
+			if (ret)
+				goto unlock;
+		}
 	}
 
 	if (val & DROP_UNBOUND) {
 		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
 					 global_list)
 			if (obj->pages_pin_count == 0) {
+				/* FIXME: Do this for all vms? */
 				ret = i915_gem_object_put_pages(obj);
 				if (ret)
 					goto unlock;
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 1449d06..4650519 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	i915_dump_device_info(dev_priv);
 
-	INIT_LIST_HEAD(&dev_priv->vm_list);
-	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
-	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
-
 	if (i915_get_bridge_dev(dev)) {
 		ret = -EIO;
 		goto free_priv;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8b3167e..681cb41 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1379,52 +1379,6 @@ struct drm_i915_gem_object {
 
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
-/* This is a temporary define to help transition us to real VMAs. If you see
- * this, you're either reviewing code, or bisecting it. */
-static inline struct i915_vma *
-__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
-{
-	if (list_empty(&obj->vma_list))
-		return NULL;
-	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
-}
-
-/* Whether or not this object is currently mapped by the translation tables */
-static inline bool
-i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
-{
-	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
-	if (vma == NULL)
-		return false;
-	return drm_mm_node_allocated(&vma->node);
-}
-
-/* Offset of the first PTE pointing to this object */
-static inline unsigned long
-i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
-{
-	BUG_ON(list_empty(&o->vma_list));
-	return __i915_gem_obj_to_vma(o)->node.start;
-}
-
-/* The size used in the translation tables may be larger than the actual size of
- * the object on GEN2/GEN3 because of the way tiling is handled. See
- * i915_gem_get_gtt_size() for more details.
- */
-static inline unsigned long
-i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
-{
-	BUG_ON(list_empty(&o->vma_list));
-	return __i915_gem_obj_to_vma(o)->node.size;
-}
-
-static inline void
-i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
-			    enum i915_cache_level color)
-{
-	__i915_gem_obj_to_vma(o)->node.color = color;
-}
-
 /**
  * Request queue structure.
  *
@@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 void i915_gem_vma_destroy(struct i915_vma *vma);
 
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
-int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
+int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
+					struct i915_address_space *vm);
 int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
@@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
 void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    struct intel_ring_buffer *ring);
 
 int i915_gem_dumb_create(struct drm_file *file_priv,
@@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
 			    int tiling_mode, bool fenced);
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    enum i915_cache_level cache_level);
 
 struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
@@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 
 void i915_gem_restore_fences(struct drm_device *dev);
 
+unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
+				  struct i915_address_space *vm);
+bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
+bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
+			struct i915_address_space *vm);
+unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
+				struct i915_address_space *vm);
+void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
+			    struct i915_address_space *vm,
+			    enum i915_cache_level color);
+struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm);
+/* Some GGTT VM helpers */
+#define obj_to_ggtt(obj) \
+	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
+static inline bool i915_is_ggtt(struct i915_address_space *vm)
+{
+	struct i915_address_space *ggtt =
+		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
+	return vm == ggtt;
+}
+
+static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
+}
+
+static inline unsigned long
+i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
+}
+
+static inline unsigned long
+i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
+}
+
+static inline int __must_check
+i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
+		  uint32_t alignment,
+		  bool map_and_fenceable,
+		  bool nonblocking)
+{
+	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
+				   map_and_fenceable, nonblocking);
+}
+#undef obj_to_ggtt
+
 /* i915_gem_context.c */
 void i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
@@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
+/* FIXME: this is never okay with full PPGTT */
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 				enum i915_cache_level cache_level);
 void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
@@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
 
 
 /* i915_gem_evict.c */
-int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
+int __must_check i915_gem_evict_something(struct drm_device *dev,
+					  struct i915_address_space *vm,
+					  int min_size,
 					  unsigned alignment,
 					  unsigned cache_level,
 					  bool mappable,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2283765..0111554 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -38,10 +38,12 @@
 
 static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
-static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
-						    unsigned alignment,
-						    bool map_and_fenceable,
-						    bool nonblocking);
+static __must_check int
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
+	return i915_gem_obj_bound_any(obj) && !obj->active;
 }
 
 int
@@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		 * anyway again before the next pread happens. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush = 1;
-		if (i915_gem_obj_ggtt_bound(obj)) {
+		if (i915_gem_obj_bound_any(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, false);
 			if (ret)
 				return ret;
@@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	char __user *user_data;
 	int page_offset, page_length, ret;
 
-	ret = i915_gem_object_pin(obj, 0, true, true);
+	ret = i915_gem_ggtt_pin(obj, 0, true, true);
 	if (ret)
 		goto out;
 
@@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		 * right away and we therefore have to clflush anyway. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush_after = 1;
-		if (i915_gem_obj_ggtt_bound(obj)) {
+		if (i915_gem_obj_bound_any(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, true);
 			if (ret)
 				return ret;
@@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	}
 
 	/* Now bind it into the GTT if needed */
-	ret = i915_gem_object_pin(obj, 0, true, false);
+	ret = i915_gem_ggtt_pin(obj,  0, true, false);
 	if (ret)
 		goto unlock;
 
@@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	if (obj->pages == NULL)
 		return 0;
 
-	BUG_ON(i915_gem_obj_ggtt_bound(obj));
-
 	if (obj->pages_pin_count)
 		return -EBUSY;
 
+	BUG_ON(i915_gem_obj_bound_any(obj));
+
 	/* ->put_pages might need to allocate memory for the bit17 swizzle
 	 * array, hence protect them from being reaped by removing them from gtt
 	 * lists early. */
@@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		  bool purgeable_only)
 {
 	struct drm_i915_gem_object *obj, *next;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	long count = 0;
 
 	list_for_each_entry_safe(obj, next,
@@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		}
 	}
 
-	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
-		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
-		    i915_gem_object_unbind(obj) == 0 &&
-		    i915_gem_object_put_pages(obj) == 0) {
+	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
+				 global_list) {
+		struct i915_vma *vma, *v;
+
+		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
+			continue;
+
+		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
+			if (i915_gem_object_unbind(obj, vma->vm))
+				break;
+
+		if (!i915_gem_object_put_pages(obj))
 			count += obj->base.size >> PAGE_SHIFT;
-			if (count >= target)
-				return count;
-		}
+
+		if (count >= target)
+			return count;
 	}
 
 	return count;
@@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 
 void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
+			       struct i915_address_space *vm,
 			       struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 seqno = intel_ring_get_seqno(ring);
 
 	BUG_ON(ring == NULL);
@@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
+i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
+				 struct i915_address_space *vm)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
-
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
@@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 	spin_unlock(&file_priv->mm.lock);
 }
 
-static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
+static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm)
 {
-	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
-	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
+	if (acthd >= i915_gem_obj_offset(obj, vm) &&
+	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
 		return true;
 
 	return false;
@@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
 	return false;
 }
 
+static struct i915_address_space *
+request_to_vm(struct drm_i915_gem_request *request)
+{
+	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
+	struct i915_address_space *vm;
+
+	vm = &dev_priv->gtt.base;
+
+	return vm;
+}
+
 static bool i915_request_guilty(struct drm_i915_gem_request *request,
 				const u32 acthd, bool *inside)
 {
@@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
 	 * pointing inside the ring, matches the batch_obj address range.
 	 * However this is extremely unlikely.
 	 */
-
 	if (request->batch_obj) {
-		if (i915_head_inside_object(acthd, request->batch_obj)) {
+		if (i915_head_inside_object(acthd, request->batch_obj,
+					    request_to_vm(request))) {
 			*inside = true;
 			return true;
 		}
@@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
 {
 	struct i915_ctx_hang_stats *hs = NULL;
 	bool inside, guilty;
+	unsigned long offset = 0;
 
 	/* Innocent until proven guilty */
 	guilty = false;
 
+	if (request->batch_obj)
+		offset = i915_gem_obj_offset(request->batch_obj,
+					     request_to_vm(request));
+
 	if (ring->hangcheck.action != wait &&
 	    i915_request_guilty(request, acthd, &inside)) {
 		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
 			  ring->name,
 			  inside ? "inside" : "flushing",
-			  request->batch_obj ?
-			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
+			  offset,
 			  request->ctx ? request->ctx->id : 0,
 			  acthd);
 
@@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
 	}
 
 	while (!list_empty(&ring->active_list)) {
+		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
 				       struct drm_i915_gem_object,
 				       ring_list);
 
-		i915_gem_object_move_to_inactive(obj);
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+			i915_gem_object_move_to_inactive(obj, vm);
 	}
 }
 
@@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
 	int i;
@@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev)
 	/* Move everything out of the GPU domains to ensure we do any
 	 * necessary invalidation upon reuse.
 	 */
-	list_for_each_entry(obj, &vm->inactive_list, mm_list)
-		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		list_for_each_entry(obj, &vm->inactive_list, mm_list)
+			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 
 	i915_gem_restore_fences(dev);
 }
@@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 	 * by the ringbuffer to the flushing/inactive lists as appropriate.
 	 */
 	while (!list_empty(&ring->active_list)) {
+		struct drm_i915_private *dev_priv = ring->dev->dev_private;
+		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
@@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
 			break;
 
-		i915_gem_object_move_to_inactive(obj);
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+			i915_gem_object_move_to_inactive(obj, vm);
 	}
 
 	if (unlikely(ring->trace_irq_seqno &&
@@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
  * Unbinds an object from the GTT aperture.
  */
 int
-i915_gem_object_unbind(struct drm_i915_gem_object *obj)
+i915_gem_object_unbind(struct drm_i915_gem_object *obj,
+		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (!i915_gem_obj_bound(obj, vm))
 		return 0;
 
 	if (obj->pin_count)
@@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	if (ret)
 		return ret;
 
-	trace_i915_gem_object_unbind(obj);
+	trace_i915_gem_object_unbind(obj, vm);
 
 	if (obj->has_global_gtt_mapping)
 		i915_gem_gtt_unbind_object(obj);
@@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
-	vma = __i915_gem_obj_to_vma(obj);
+	vma = i915_gem_obj_to_vma(obj, vm);
 	list_del(&vma->vma_link);
 	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
@@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
 		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
 		     i915_gem_obj_ggtt_offset(obj), size);
 
+
 		pitch_val = obj->stride / 128;
 		pitch_val = ffs(pitch_val) - 1;
 
@@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
  * Finds free space in the GTT aperture and binds the object there.
  */
 static int
-i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
-			    unsigned alignment,
-			    bool map_and_fenceable,
-			    bool nonblocking)
+i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   unsigned alignment,
+			   bool map_and_fenceable,
+			   bool nonblocking)
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
-	size_t gtt_max = map_and_fenceable ?
-		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
+	size_t gtt_max =
+		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
 	struct i915_vma *vma;
 	int ret;
 
 	if (WARN_ON(!list_empty(&obj->vma_list)))
 		return -EBUSY;
 
+	BUG_ON(!i915_is_ggtt(vm));
+
 	fence_size = i915_gem_get_gtt_size(dev,
 					   obj->base.size,
 					   obj->tiling_mode);
@@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
+	/* For now we only ever use 1 vma per object */
+	WARN_ON(!list_empty(&obj->vma_list));
+
+	vma = i915_gem_vma_create(obj, vm);
 	if (IS_ERR(vma)) {
 		i915_gem_object_unpin_pages(obj);
 		return PTR_ERR(vma);
 	}
 
 search_free:
-	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
-						  &vma->node,
+	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max);
 	if (ret) {
-		ret = i915_gem_evict_something(dev, size, alignment,
+		ret = i915_gem_evict_something(dev, vm, size, alignment,
 					       obj->cache_level,
 					       map_and_fenceable,
 					       nonblocking);
@@ -3138,18 +3172,25 @@ search_free:
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &vm->inactive_list);
-	list_add(&vma->vma_link, &obj->vma_list);
+
+	/* Keep GGTT vmas first to make debug easier */
+	if (i915_is_ggtt(vm))
+		list_add(&vma->vma_link, &obj->vma_list);
+	else
+		list_add_tail(&vma->vma_link, &obj->vma_list);
 
 	fenceable =
+		i915_is_ggtt(vm) &&
 		i915_gem_obj_ggtt_size(obj) == fence_size &&
 		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
 
-	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
-		dev_priv->gtt.mappable_end;
+	mappable =
+		i915_is_ggtt(vm) &&
+		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
 	obj->map_and_fenceable = mappable && fenceable;
 
-	trace_i915_gem_object_bind(obj, map_and_fenceable);
+	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
 	i915_gem_verify_gtt(dev);
 	return 0;
 
@@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	int ret;
 
 	/* Not valid to be called on unbound objects. */
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (!i915_gem_obj_bound_any(obj))
 		return -EINVAL;
 
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
@@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 }
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
 	int ret;
 
 	if (obj->cache_level == cache_level)
@@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 	}
 
 	if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
-		ret = i915_gem_object_unbind(obj);
+		ret = i915_gem_object_unbind(obj, vm);
 		if (ret)
 			return ret;
 	}
 
-	if (i915_gem_obj_ggtt_bound(obj)) {
+	list_for_each_entry(vma, &obj->vma_list, vma_link) {
 		ret = i915_gem_object_finish_gpu(obj);
 		if (ret)
 			return ret;
@@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
 					       obj, cache_level);
 
-		i915_gem_obj_ggtt_set_color(obj, cache_level);
+		i915_gem_obj_set_color(obj, vma->vm, cache_level);
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file)
 {
 	struct drm_i915_gem_caching *args = data;
+	struct drm_i915_private *dev_priv;
 	struct drm_i915_gem_object *obj;
 	enum i915_cache_level level;
 	int ret;
@@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 		ret = -ENOENT;
 		goto unlock;
 	}
+	dev_priv = obj->base.dev->dev_private;
 
-	ret = i915_gem_object_set_cache_level(obj, level);
+	/* FIXME: Add interface for specific VM? */
+	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
 
 	drm_gem_object_unreference(&obj->base);
 unlock:
@@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     struct intel_ring_buffer *pipelined)
 {
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	u32 old_read_domains, old_write_domain;
 	int ret;
 
@@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * of uncaching, which would allow us to flush all the LLC-cached data
 	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
 	 */
-	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
+	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					      I915_CACHE_NONE);
 	if (ret)
 		return ret;
 
@@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * (e.g. libkms for the bootup splash), we have to ensure that we
 	 * always use map_and_fenceable for all scanout buffers.
 	 */
-	ret = i915_gem_object_pin(obj, alignment, true, false);
+	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
 	if (ret)
 		return ret;
 
@@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 int
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
+		    struct i915_address_space *vm,
 		    uint32_t alignment,
 		    bool map_and_fenceable,
 		    bool nonblocking)
@@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
 		return -EBUSY;
 
-	if (i915_gem_obj_ggtt_bound(obj)) {
-		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
+	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
+
+	if (i915_gem_obj_bound(obj, vm)) {
+		if ((alignment &&
+		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
 		    (map_and_fenceable && !obj->map_and_fenceable)) {
 			WARN(obj->pin_count,
 			     "bo is already pinned with incorrect alignment:"
 			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
 			     " obj->map_and_fenceable=%d\n",
-			     i915_gem_obj_ggtt_offset(obj), alignment,
+			     i915_gem_obj_offset(obj, vm), alignment,
 			     map_and_fenceable,
 			     obj->map_and_fenceable);
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_gem_object_unbind(obj, vm);
 			if (ret)
 				return ret;
 		}
 	}
 
-	if (!i915_gem_obj_ggtt_bound(obj)) {
+	if (!i915_gem_obj_bound(obj, vm)) {
 		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 
-		ret = i915_gem_object_bind_to_gtt(obj, alignment,
-						  map_and_fenceable,
-						  nonblocking);
+		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
+						 map_and_fenceable,
+						 nonblocking);
 		if (ret)
 			return ret;
 
@@ -3666,7 +3717,7 @@ void
 i915_gem_object_unpin(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pin_count == 0);
-	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
+	BUG_ON(!i915_gem_obj_bound_any(obj));
 
 	if (--obj->pin_count == 0)
 		obj->pin_mappable = false;
@@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	}
 
 	if (obj->user_pin_count == 0) {
-		ret = i915_gem_object_pin(obj, args->alignment, true, false);
+		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
 		if (ret)
 			goto out;
 	}
@@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_vma *vma, *next;
 
 	trace_i915_gem_object_destroy(obj);
 
@@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		i915_gem_detach_phys_object(dev, obj);
 
 	obj->pin_count = 0;
-	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
-		bool was_interruptible;
+	/* NB: 0 or 1 elements */
+	WARN_ON(!list_empty(&obj->vma_list) &&
+		!list_is_singular(&obj->vma_list));
+	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
+		int ret = i915_gem_object_unbind(obj, vma->vm);
+		if (WARN_ON(ret == -ERESTARTSYS)) {
+			bool was_interruptible;
 
-		was_interruptible = dev_priv->mm.interruptible;
-		dev_priv->mm.interruptible = false;
+			was_interruptible = dev_priv->mm.interruptible;
+			dev_priv->mm.interruptible = false;
 
-		WARN_ON(i915_gem_object_unbind(obj));
+			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
 
-		dev_priv->mm.interruptible = was_interruptible;
+			dev_priv->mm.interruptible = was_interruptible;
+		}
 	}
 
 	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
@@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
 	INIT_LIST_HEAD(&ring->request_list);
 }
 
+static void i915_init_vm(struct drm_i915_private *dev_priv,
+			 struct i915_address_space *vm)
+{
+	vm->dev = dev_priv->dev;
+	INIT_LIST_HEAD(&vm->active_list);
+	INIT_LIST_HEAD(&vm->inactive_list);
+	INIT_LIST_HEAD(&vm->global_link);
+	list_add(&vm->global_link, &dev_priv->vm_list);
+}
+
 void
 i915_gem_load(struct drm_device *dev)
 {
@@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev)
 				  SLAB_HWCACHE_ALIGN,
 				  NULL);
 
-	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
-	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
+	INIT_LIST_HEAD(&dev_priv->vm_list);
+	i915_init_vm(dev_priv, &dev_priv->gtt.base);
+
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
@@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 			     struct drm_i915_private,
 			     mm.inactive_shrinker);
 	struct drm_device *dev = dev_priv->dev;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
-	int nr_to_scan = sc->nr_to_scan;
+	int nr_to_scan;
 	bool unlock = true;
 	int cnt;
 
@@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 		unlock = false;
 	}
 
+	nr_to_scan = sc->nr_to_scan;
 	if (nr_to_scan) {
 		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
 		if (nr_to_scan > 0)
@@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
 		if (obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
-	list_for_each_entry(obj, &vm->inactive_list, mm_list)
+
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		if (obj->active)
+			continue;
+
+		i915_gem_object_flush_gtt_write_domain(obj);
+		i915_gem_object_flush_cpu_write_domain(obj);
+		/* FIXME: Can't assume global gtt */
+		i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base);
+
 		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
+	}
 
 	if (unlock)
 		mutex_unlock(&dev->struct_mutex);
 	return cnt;
 }
+
+/* All the new VM stuff */
+unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
+				  struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_vma *vma;
+
+	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
+		vm = &dev_priv->gtt.base;
+
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm)
+			return vma->node.start;
+
+	}
+	return -1;
+}
+
+bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
+			struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+
+	list_for_each_entry(vma, &o->vma_list, vma_link)
+		if (vma->vm == vm)
+			return true;
+
+	return false;
+}
+
+bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_address_space *vm;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		if (i915_gem_obj_bound(o, vm))
+			return true;
+
+	return false;
+}
+
+unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
+				struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_vma *vma;
+
+	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
+		vm = &dev_priv->gtt.base;
+
+	BUG_ON(list_empty(&o->vma_list));
+
+	list_for_each_entry(vma, &o->vma_list, vma_link)
+		if (vma->vm == vm)
+			return vma->node.size;
+
+	return 0;
+}
+
+void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
+			    struct i915_address_space *vm,
+			    enum i915_cache_level color)
+{
+	struct i915_vma *vma;
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm) {
+			vma->node.color = color;
+			return;
+		}
+	}
+
+	WARN(1, "Couldn't set color for VM %p\n", vm);
+}
+
+struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+	list_for_each_entry(vma, &obj->vma_list, vma_link)
+		if (vma->vm == vm)
+			return vma;
+
+	return NULL;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2470206..873577d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
 
 	if (INTEL_INFO(dev)->gen >= 7) {
 		ret = i915_gem_object_set_cache_level(ctx->obj,
+						      &dev_priv->gtt.base,
 						      I915_CACHE_LLC_MLC);
 		/* Failure shouldn't ever happen this early */
 		if (WARN_ON(ret))
@@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 	 * default context.
 	 */
 	dev_priv->ring[RCS].default_context = ctx;
-	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
+	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
 	if (ret) {
 		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
 		goto err_destroy;
@@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to)
 	if (from == to)
 		return 0;
 
-	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
+	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
 	if (ret)
 		return ret;
 
@@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to)
 	 */
 	if (from != NULL) {
 		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-		i915_gem_object_move_to_active(from->obj, ring);
+		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
+					       ring);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
 		 * object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index df61f33..32efdc0 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -32,24 +32,21 @@
 #include "i915_trace.h"
 
 static bool
-mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
+mark_free(struct i915_vma *vma, struct list_head *unwind)
 {
-	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
-
-	if (obj->pin_count)
+	if (vma->obj->pin_count)
 		return false;
 
-	list_add(&obj->exec_list, unwind);
+	list_add(&vma->obj->exec_list, unwind);
 	return drm_mm_scan_add_block(&vma->node);
 }
 
 int
-i915_gem_evict_something(struct drm_device *dev, int min_size,
-			 unsigned alignment, unsigned cache_level,
+i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
+			 int min_size, unsigned alignment, unsigned cache_level,
 			 bool mappable, bool nonblocking)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct list_head eviction_list, unwind_list;
 	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
@@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 	 */
 
 	INIT_LIST_HEAD(&unwind_list);
-	if (mappable)
+	if (mappable) {
+		BUG_ON(!i915_is_ggtt(vm));
 		drm_mm_init_scan_with_range(&vm->mm, min_size,
 					    alignment, cache_level, 0,
 					    dev_priv->gtt.mappable_end);
-	else
+	} else
 		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
 	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
-		if (mark_free(obj, &unwind_list))
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
 
@@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 
 	/* Now merge in the soon-to-be-expired objects... */
 	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		if (mark_free(obj, &unwind_list))
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
 
@@ -109,7 +109,7 @@ none:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = __i915_gem_obj_to_vma(obj);
+		vma = i915_gem_obj_to_vma(obj, vm);
 		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
@@ -130,7 +130,7 @@ found:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = __i915_gem_obj_to_vma(obj);
+		vma = i915_gem_obj_to_vma(obj, vm);
 		if (drm_mm_scan_remove_block(&vma->node)) {
 			list_move(&obj->exec_list, &eviction_list);
 			drm_gem_object_reference(&obj->base);
@@ -145,7 +145,7 @@ found:
 				       struct drm_i915_gem_object,
 				       exec_list);
 		if (ret == 0)
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_gem_object_unbind(obj, vm);
 
 		list_del_init(&obj->exec_list);
 		drm_gem_object_unreference(&obj->base);
@@ -158,13 +158,18 @@ int
 i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj, *next;
-	bool lists_empty;
+	bool lists_empty = true;
 	int ret;
 
-	lists_empty = (list_empty(&vm->inactive_list) &&
-		       list_empty(&vm->active_list));
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		lists_empty = (list_empty(&vm->inactive_list) &&
+			       list_empty(&vm->active_list));
+		if (!lists_empty)
+			lists_empty = false;
+	}
+
 	if (lists_empty)
 		return -ENOSPC;
 
@@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
 	i915_gem_retire_requests(dev);
 
 	/* Having flushed everything, unbind() should never raise an error */
-	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
-		if (obj->pin_count == 0)
-			WARN_ON(i915_gem_object_unbind(obj));
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
+			if (obj->pin_count == 0)
+				WARN_ON(i915_gem_object_unbind(obj, vm));
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 1734825..819d8d8 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
 }
 
 static void
-eb_destroy(struct eb_objects *eb)
+eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
 {
 	while (!list_empty(&eb->objects)) {
 		struct drm_i915_gem_object *obj;
@@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 				   struct eb_objects *eb,
-				   struct drm_i915_gem_relocation_entry *reloc)
+				   struct drm_i915_gem_relocation_entry *reloc,
+				   struct i915_address_space *vm)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_gem_object *target_obj;
@@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 
 static int
 i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
-				    struct eb_objects *eb)
+				    struct eb_objects *eb,
+				    struct i915_address_space *vm)
 {
 #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
 	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
@@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 		do {
 			u64 offset = r->presumed_offset;
 
-			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
+			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
+								 vm);
 			if (ret)
 				return ret;
 
@@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 static int
 i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 					 struct eb_objects *eb,
-					 struct drm_i915_gem_relocation_entry *relocs)
+					 struct drm_i915_gem_relocation_entry *relocs,
+					 struct i915_address_space *vm)
 {
 	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 	int i, ret;
 
 	for (i = 0; i < entry->relocation_count; i++) {
-		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
+		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
+							 vm);
 		if (ret)
 			return ret;
 	}
@@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate(struct eb_objects *eb)
+i915_gem_execbuffer_relocate(struct eb_objects *eb,
+			     struct i915_address_space *vm)
 {
 	struct drm_i915_gem_object *obj;
 	int ret = 0;
@@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
 	 */
 	pagefault_disable();
 	list_for_each_entry(obj, &eb->objects, exec_list) {
-		ret = i915_gem_execbuffer_relocate_object(obj, eb);
+		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
 		if (ret)
 			break;
 	}
@@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 				   struct intel_ring_buffer *ring,
+				   struct i915_address_space *vm,
 				   bool *need_reloc)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
@@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->tiling_mode != I915_TILING_NONE;
 	need_mappable = need_fence || need_reloc_mappable(obj);
 
-	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
+	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
+				  false);
 	if (ret)
 		return ret;
 
@@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->has_aliasing_ppgtt_mapping = 1;
 	}
 
-	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
-		entry->offset = i915_gem_obj_ggtt_offset(obj);
+	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
+		entry->offset = i915_gem_obj_offset(obj, vm);
 		*need_reloc = true;
 	}
 
@@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_gem_exec_object2 *entry;
 
-	if (!i915_gem_obj_ggtt_bound(obj))
+	if (!i915_gem_obj_bound_any(obj))
 		return;
 
 	entry = obj->exec_entry;
@@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			    struct list_head *objects,
+			    struct i915_address_space *vm,
 			    bool *need_relocs)
 {
 	struct drm_i915_gem_object *obj;
@@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 		list_for_each_entry(obj, objects, exec_list) {
 			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 			bool need_fence, need_mappable;
+			u32 obj_offset;
 
-			if (!i915_gem_obj_ggtt_bound(obj))
+			if (!i915_gem_obj_bound(obj, vm))
 				continue;
 
+			obj_offset = i915_gem_obj_offset(obj, vm);
 			need_fence =
 				has_fenced_gpu_access &&
 				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 				obj->tiling_mode != I915_TILING_NONE;
 			need_mappable = need_fence || need_reloc_mappable(obj);
 
+			BUG_ON((need_mappable || need_fence) &&
+			       !i915_is_ggtt(vm));
+
 			if ((entry->alignment &&
-			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
+			     obj_offset & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
-				ret = i915_gem_object_unbind(obj);
+				ret = i915_gem_object_unbind(obj, vm);
 			else
-				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
+				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
 				goto err;
 		}
 
 		/* Bind fresh objects */
 		list_for_each_entry(obj, objects, exec_list) {
-			if (i915_gem_obj_ggtt_bound(obj))
+			if (i915_gem_obj_bound(obj, vm))
 				continue;
 
-			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
+			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
 				goto err;
 		}
@@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 				  struct drm_file *file,
 				  struct intel_ring_buffer *ring,
 				  struct eb_objects *eb,
-				  struct drm_i915_gem_exec_object2 *exec)
+				  struct drm_i915_gem_exec_object2 *exec,
+				  struct i915_address_space *vm)
 {
 	struct drm_i915_gem_relocation_entry *reloc;
 	struct drm_i915_gem_object *obj;
@@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 		goto err;
 
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
 	if (ret)
 		goto err;
 
 	list_for_each_entry(obj, &eb->objects, exec_list) {
 		int offset = obj->exec_entry - exec;
 		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
-							       reloc + reloc_offset[offset]);
+							       reloc + reloc_offset[offset],
+							       vm);
 		if (ret)
 			goto err;
 	}
@@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
 
 static void
 i915_gem_execbuffer_move_to_active(struct list_head *objects,
+				   struct i915_address_space *vm,
 				   struct intel_ring_buffer *ring)
 {
 	struct drm_i915_gem_object *obj;
@@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
-		i915_gem_object_move_to_active(obj, ring);
+		i915_gem_object_move_to_active(obj, vm, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
@@ -838,7 +855,8 @@ static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec)
+		       struct drm_i915_gem_exec_object2 *exec,
+		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct eb_objects *eb;
@@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	/* Move the objects en-masse into the GTT, evicting if necessary. */
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
 	if (ret)
 		goto err;
 
 	/* The objects are in their final locations, apply the relocations. */
 	if (need_relocs)
-		ret = i915_gem_execbuffer_relocate(eb);
+		ret = i915_gem_execbuffer_relocate(eb, vm);
 	if (ret) {
 		if (ret == -EFAULT) {
 			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
-								eb, exec);
+								eb, exec, vm);
 			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 		}
 		if (ret)
@@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
+	exec_start = i915_gem_obj_offset(batch_obj, vm) +
+		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
@@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
-	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
+	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
 	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
 
 err:
-	eb_destroy(eb);
+	eb_destroy(eb, vm);
 
 	mutex_unlock(&dev->struct_mutex);
 
@@ -1107,6 +1126,7 @@ int
 i915_gem_execbuffer(struct drm_device *dev, void *data,
 		    struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer *args = data;
 	struct drm_i915_gem_execbuffer2 exec2;
 	struct drm_i915_gem_exec_object *exec_list = NULL;
@@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 	exec2.flags = I915_EXEC_RENDER;
 	i915_execbuffer2_set_context_id(exec2, 0);
 
-	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
+	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
+				     &dev_priv->gtt.base);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		for (i = 0; i < args->buffer_count; i++)
@@ -1188,6 +1209,7 @@ int
 i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		     struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
 	int ret;
@@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
+	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
+				     &dev_priv->gtt.base);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3b639a9..44f3464 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
 			    ppgtt->base.total);
 	}
 
+	/* i915_init_vm(dev_priv, &ppgtt->base) */
+
 	return ret;
 }
 
@@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level)
 {
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
+	struct i915_address_space *vm = &ppgtt->base;
+	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
+
+	vm->insert_entries(vm, obj->pages,
+			   obj_offset >> PAGE_SHIFT,
+			   cache_level);
 }
 
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
+	struct i915_address_space *vm = &ppgtt->base;
+	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
+
+	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
+			obj->base.size >> PAGE_SHIFT);
 }
 
 extern int intel_iommu_gfx_mapped;
@@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       dev_priv->gtt.base.start / PAGE_SIZE,
 				       dev_priv->gtt.base.total / PAGE_SIZE);
 
+	if (dev_priv->mm.aliasing_ppgtt)
+		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
+
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		i915_gem_clflush_object(obj);
 		i915_gem_gtt_bind_object(obj, obj->cache_level);
@@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	 * aperture.  One page should be enough to keep any prefetching inside
 	 * of the aperture.
 	 */
-	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
 	struct drm_mm_node *entry;
 	struct drm_i915_gem_object *obj;
 	unsigned long hole_start, hole_end;
@@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	BUG_ON(mappable_end > end);
 
 	/* Subtract the guard page ... */
-	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
+	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
 	if (!HAS_LLC(dev))
 		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
 		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
 			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
 
 		WARN_ON(i915_gem_obj_ggtt_bound(obj));
-		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
+		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
 		if (ret)
 			DRM_DEBUG_KMS("Reservation failed\n");
 		obj->has_global_gtt_mapping = 1;
@@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	dev_priv->gtt.base.total = end - start;
 
 	/* Clear any non-preallocated blocks */
-	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
-			     hole_start, hole_end) {
+	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
 		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
 			      hole_start, hole_end);
-		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-					       hole_start / PAGE_SIZE,
-					       count);
+		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
 	}
 
 	/* And finally clear the reserved guard page */
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       end / PAGE_SIZE - 1, 1);
+	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
 }
 
 static bool
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 27ffb4c..000ffbd 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					       u32 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *ggtt = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	struct i915_vma *vma;
@@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (gtt_offset == I915_GTT_OFFSET_NONE)
 		return obj;
 
-	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
+	vma = i915_gem_vma_create(obj, ggtt);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
 		goto err_out;
@@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	 */
 	vma->node.start = gtt_offset;
 	vma->node.size = size;
-	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
-		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
+	if (drm_mm_initialized(&ggtt->mm)) {
+		ret = drm_mm_reserve_node(&ggtt->mm, &vma->node);
 		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
 			i915_gem_vma_destroy(vma);
@@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
 
 	return obj;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 92a8d27..808ca2a 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 
 		obj->map_and_fenceable =
 			!i915_gem_obj_ggtt_bound(obj) ||
-			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
+			(i915_gem_obj_ggtt_offset(obj) +
+			 obj->base.size <= dev_priv->gtt.mappable_end &&
 			 i915_gem_object_fence_ok(obj, args->tiling_mode));
 
 		/* Rebind if we need a change of alignment */
 		if (!obj->map_and_fenceable) {
-			u32 unfenced_alignment =
+			struct i915_address_space *ggtt = &dev_priv->gtt.base;
+			u32 unfenced_align =
 				i915_gem_get_gtt_alignment(dev, obj->base.size,
 							    args->tiling_mode,
 							    false);
-			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
-				ret = i915_gem_object_unbind(obj);
+			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
+				ret = i915_gem_object_unbind(obj, ggtt);
 		}
 
 		if (ret == 0) {
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 7d283b5..3f019d3 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
 );
 
 TRACE_EVENT(i915_gem_object_bind,
-	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
-	    TP_ARGS(obj, mappable),
+	    TP_PROTO(struct drm_i915_gem_object *obj,
+		     struct i915_address_space *vm, bool mappable),
+	    TP_ARGS(obj, vm, mappable),
 
 	    TP_STRUCT__entry(
 			     __field(struct drm_i915_gem_object *, obj)
+			     __field(struct i915_address_space *, vm)
 			     __field(u32, offset)
 			     __field(u32, size)
 			     __field(bool, mappable)
@@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
 
 	    TP_fast_assign(
 			   __entry->obj = obj;
-			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
-			   __entry->size = i915_gem_obj_ggtt_size(obj);
+			   __entry->offset = i915_gem_obj_offset(obj, vm);
+			   __entry->size = i915_gem_obj_size(obj, vm);
 			   __entry->mappable = mappable;
 			   ),
 
@@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
 );
 
 TRACE_EVENT(i915_gem_object_unbind,
-	    TP_PROTO(struct drm_i915_gem_object *obj),
-	    TP_ARGS(obj),
+	    TP_PROTO(struct drm_i915_gem_object *obj,
+		     struct i915_address_space *vm),
+	    TP_ARGS(obj, vm),
 
 	    TP_STRUCT__entry(
 			     __field(struct drm_i915_gem_object *, obj)
+			     __field(struct i915_address_space *, vm)
 			     __field(u32, offset)
 			     __field(u32, size)
 			     ),
 
 	    TP_fast_assign(
 			   __entry->obj = obj;
-			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
-			   __entry->size = i915_gem_obj_ggtt_size(obj);
+			   __entry->offset = i915_gem_obj_offset(obj, vm);
+			   __entry->size = i915_gem_obj_size(obj, vm);
 			   ),
 
 	    TP_printk("obj=%p, offset=%08x size=%x",
diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index f3c97e0..b69cc63 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
 		      fb->width, fb->height,
 		      i915_gem_obj_ggtt_offset(obj), obj);
 
-
 	mutex_unlock(&dev->struct_mutex);
 	vga_switcheroo_client_fb_set(dev->pdev, info);
 	return 0;
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 2abb53e..22ccb7e 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
 		}
 		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
 	} else {
-		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
+		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
 		if (ret) {
 			DRM_ERROR("failed to pin overlay register bo\n");
 			goto out_free_bo;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 008e0e0..0fb081c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev)
 		return NULL;
 	}
 
-	ret = i915_gem_object_pin(ctx, 4096, true, false);
+	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
 	if (ret) {
 		DRM_ERROR("failed to pin power context: %d\n", ret);
 		goto err_unref;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8527ea0..88130a3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -481,6 +481,7 @@ out:
 static int
 init_pipe_control(struct intel_ring_buffer *ring)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct pipe_control *pc;
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
 		goto err;
 	}
 
-	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true, false);
+	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
 	if (ret)
 		goto err_unref;
 
@@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
 static int init_status_page(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	int ret;
 
@@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
 		goto err;
 	}
 
-	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true, false);
+	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
 	if (ret != 0) {
 		goto err_unref;
 	}
@@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 
 	ring->obj = obj;
 
-	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
+	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
 	if (ret)
 		goto err_unref;
 
@@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 			return -ENOMEM;
 		}
 
-		ret = i915_gem_object_pin(obj, 0, true, false);
+		ret = i915_gem_ggtt_pin(obj, 0, true, false);
 		if (ret != 0) {
 			drm_gem_object_unreference(&obj->base);
 			DRM_ERROR("Failed to ping batch bo\n");
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 02/12] drm/i915: Fix up map and fenceable for VMA
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
  2013-07-22  2:08 ` [PATCH 01/12] drm/i915: plumb VM into object operations Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-23 16:42   ` Daniel Vetter
  2013-07-22  2:08 ` [PATCH 03/12] drm/i915: Update error capture for VMs Ben Widawsky
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
tracking"

The map_and_fenceable tracking is per object. GTT mapping, and fences
only apply to global GTT. As such,  object operations which are not
performed on the global GTT should not effect mappable or fenceable
characteristics.

Functionally, this commit could very well be squashed in to the previous
patch which updated object operations to take a VM argument.  This
commit is split out because it's a bit tricky (or at least it was for
me).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c    | 53 ++++++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_drv.h        |  5 ++--
 drivers/gpu/drm/i915/i915_gem.c        | 43 +++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_evict.c  | 14 ++++-----
 drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c  | 37 ++++++++++++++----------
 6 files changed, 93 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f8e590f..0b7df6c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -144,7 +144,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	size_t total_obj_size, total_gtt_size;
 	int count, ret;
 
@@ -152,6 +152,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	if (ret)
 		return ret;
 
+	/* FIXME: the user of this interface might want more than just GGTT */
 	switch (list) {
 	case ACTIVE_LIST:
 		seq_puts(m, "Active:\n");
@@ -167,12 +168,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	}
 
 	total_obj_size = total_gtt_size = count = 0;
-	list_for_each_entry(obj, head, mm_list) {
-		seq_puts(m, "   ");
-		describe_obj(m, obj);
-		seq_putc(m, '\n');
-		total_obj_size += obj->base.size;
-		total_gtt_size += i915_gem_obj_ggtt_size(obj);
+	list_for_each_entry(vma, head, mm_list) {
+		seq_printf(m, "   ");
+		describe_obj(m, vma->obj);
+		seq_printf(m, "\n");
+		total_obj_size += vma->obj->base.size;
+		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
 		count++;
 	}
 	mutex_unlock(&dev->struct_mutex);
@@ -220,7 +221,18 @@ static int per_file_stats(int id, void *ptr, void *data)
 	return 0;
 }
 
-static int i915_gem_object_info(struct seq_file *m, void *data)
+#define count_vmas(list, member) do { \
+	list_for_each_entry(vma, list, member) { \
+		size += i915_gem_obj_ggtt_size(vma->obj); \
+		++count; \
+		if (vma->obj->map_and_fenceable) { \
+			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
+			++mappable_count; \
+		} \
+	} \
+} while (0)
+
+static int i915_gem_object_info(struct seq_file *m, void* data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
@@ -230,6 +242,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_file *file;
+	struct i915_vma *vma;
 	int ret;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -249,12 +262,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&vm->active_list, mm_list);
+	count_vmas(&vm->active_list, mm_list);
 	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&vm->inactive_list, mm_list);
+	count_vmas(&vm->inactive_list, mm_list);
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -1767,7 +1780,8 @@ i915_drop_caches_set(void *data, u64 val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj, *next;
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
+	struct i915_vma *vma, *x;
 	int ret;
 
 	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
@@ -1788,14 +1802,15 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_retire_requests(dev);
 
 	if (val & DROP_BOUND) {
-		list_for_each_entry_safe(obj, next, &vm->inactive_list,
-					 mm_list) {
-			if (obj->pin_count)
-				continue;
-
-			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
-			if (ret)
-				goto unlock;
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+			list_for_each_entry_safe(vma, x, &vm->inactive_list,
+						 mm_list)
+				if (vma->obj->pin_count == 0) {
+					ret = i915_gem_object_unbind(vma->obj,
+								     vm);
+					if (ret)
+						goto unlock;
+				}
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 681cb41..b208c30 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -541,6 +541,9 @@ struct i915_vma {
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm;
 
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
 	struct list_head vma_link; /* Link in the object's VMA list */
 };
 
@@ -1258,9 +1261,7 @@ struct drm_i915_gem_object {
 	struct drm_mm_node *stolen;
 	struct list_head global_list;
 
-	/** This object's place on the active/inactive lists */
 	struct list_head ring_list;
-	struct list_head mm_list;
 	/** This object's place in the batchbuffer or on the eviction list */
 	struct list_head exec_list;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0111554..6bdf89d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1874,6 +1874,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 seqno = intel_ring_get_seqno(ring);
+	struct i915_vma *vma;
 
 	BUG_ON(ring == NULL);
 	if (obj->ring != ring && obj->last_write_seqno) {
@@ -1889,7 +1890,8 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 
 	/* Move from whatever list we were on to the tail of execution. */
-	list_move_tail(&obj->mm_list, &vm->active_list);
+	vma = i915_gem_obj_to_vma(obj, vm);
+	list_move_tail(&vma->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
 	obj->last_read_seqno = seqno;
@@ -1912,10 +1914,13 @@ static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
 				 struct i915_address_space *vm)
 {
+	struct i915_vma *vma;
+
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_move_tail(&obj->mm_list, &vm->inactive_list);
+	vma = i915_gem_obj_to_vma(obj, vm);
+	list_move_tail(&vma->mm_list, &vm->inactive_list);
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -2285,9 +2290,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_address_space *vm;
-	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	int i;
 
 	for_each_ring(ring, dev_priv, i)
@@ -2297,8 +2302,8 @@ void i915_gem_reset(struct drm_device *dev)
 	 * necessary invalidation upon reuse.
 	 */
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-		list_for_each_entry(obj, &vm->inactive_list, mm_list)
-			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
+		list_for_each_entry(vma, &vm->inactive_list, mm_list)
+			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 
 	i915_gem_restore_fences(dev);
 }
@@ -2633,7 +2638,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 
 	trace_i915_gem_object_unbind(obj, vm);
 
-	if (obj->has_global_gtt_mapping)
+	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
 		i915_gem_gtt_unbind_object(obj);
 	if (obj->has_aliasing_ppgtt_mapping) {
 		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
@@ -2642,11 +2647,12 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
-	list_del(&obj->mm_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
-	obj->map_and_fenceable = true;
+	if (i915_is_ggtt(vm))
+		obj->map_and_fenceable = true;
 
 	vma = i915_gem_obj_to_vma(obj, vm);
+	list_del(&vma->mm_list);
 	list_del(&vma->vma_link);
 	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
@@ -3171,7 +3177,7 @@ search_free:
 		goto err_out;
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	list_add_tail(&vma->mm_list, &vm->inactive_list);
 
 	/* Keep GGTT vmas first to make debug easier */
 	if (i915_is_ggtt(vm))
@@ -3188,7 +3194,9 @@ search_free:
 		i915_is_ggtt(vm) &&
 		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
-	obj->map_and_fenceable = mappable && fenceable;
+	/* Map and fenceable only changes if the VM is the global GGTT */
+	if (i915_is_ggtt(vm))
+		obj->map_and_fenceable = mappable && fenceable;
 
 	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
 	i915_gem_verify_gtt(dev);
@@ -3332,9 +3340,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 					    old_write_domain);
 
 	/* And bump the LRU for this access */
-	if (i915_gem_object_is_inactive(obj))
-		list_move_tail(&obj->mm_list,
-			       &dev_priv->gtt.base.inactive_list);
+	if (i915_gem_object_is_inactive(obj)) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
+		if (vma)
+			list_move_tail(&vma->mm_list,
+				       &dev_priv->gtt.base.inactive_list);
+
+	}
 
 	return 0;
 }
@@ -3906,7 +3919,6 @@ unlock:
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops)
 {
-	INIT_LIST_HEAD(&obj->mm_list);
 	INIT_LIST_HEAD(&obj->global_list);
 	INIT_LIST_HEAD(&obj->ring_list);
 	INIT_LIST_HEAD(&obj->exec_list);
@@ -4043,6 +4055,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&vma->vma_link);
+	INIT_LIST_HEAD(&vma->mm_list);
 	vma->vm = vm;
 	vma->obj = obj;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 32efdc0..18a44a9 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
-	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
 		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
@@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 		goto none;
 
 	/* Now merge in the soon-to-be-expired objects... */
-	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+	list_for_each_entry(vma, &vm->active_list, mm_list) {
 		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
@@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_address_space *vm;
-	struct drm_i915_gem_object *obj, *next;
+	struct i915_vma *vma, *next;
 	bool lists_empty = true;
 	int ret;
 
@@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
 
 	/* Having flushed everything, unbind() should never raise an error */
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
-		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
-			if (obj->pin_count == 0)
-				WARN_ON(i915_gem_object_unbind(obj, vm));
+		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
+			if (vma->obj->pin_count == 0)
+				WARN_ON(i915_gem_object_unbind(vma->obj, vm));
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 000ffbd..fa60103 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
+	list_add_tail(&vma->mm_list, &ggtt->inactive_list);
 
 	return obj;
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index d970d84..9623a4e 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -556,11 +556,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 static u32 capture_active_bo(struct drm_i915_error_buffer *err,
 			     int count, struct list_head *head)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	int i = 0;
 
-	list_for_each_entry(obj, head, mm_list) {
-		capture_bo(err++, obj);
+	list_for_each_entry(vma, head, mm_list) {
+		capture_bo(err++, vma->obj);
 		if (++i == count)
 			break;
 	}
@@ -622,7 +622,8 @@ static struct drm_i915_error_object *
 i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			     struct intel_ring_buffer *ring)
 {
-	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	u32 seqno;
 
@@ -642,20 +643,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	}
 
 	seqno = ring->get_seqno(ring, false);
-	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		if (obj->ring != ring)
-			continue;
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_for_each_entry(vma, &vm->active_list, mm_list) {
+			obj = vma->obj;
+			if (obj->ring != ring)
+				continue;
 
-		if (i915_seqno_passed(seqno, obj->last_read_seqno))
-			continue;
+			if (i915_seqno_passed(seqno, obj->last_read_seqno))
+				continue;
 
-		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
-			continue;
+			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
+				continue;
 
-		/* We need to copy these to an anonymous buffer as the simplest
-		 * method to avoid being overwritten by userspace.
-		 */
-		return i915_error_object_create(dev_priv, obj);
+			/* We need to copy these to an anonymous buffer as the simplest
+			 * method to avoid being overwritten by userspace.
+			 */
+			return i915_error_object_create(dev_priv, obj);
+		}
 	}
 
 	return NULL;
@@ -775,11 +779,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 				     struct drm_i915_error_state *error)
 {
 	struct i915_address_space *vm = &dev_priv->gtt.base;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	int i;
 
 	i = 0;
-	list_for_each_entry(obj, &vm->active_list, mm_list)
+	list_for_each_entry(vma, &vm->active_list, mm_list)
 		i++;
 	error->active_bo_count = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 03/12] drm/i915: Update error capture for VMs
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
  2013-07-22  2:08 ` [PATCH 01/12] drm/i915: plumb VM into object operations Ben Widawsky
  2013-07-22  2:08 ` [PATCH 02/12] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-22  2:08 ` [PATCH 04/12] drm/i915: Track active by VMA instead of object Ben Widawsky
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

formerly: "drm/i915: Create VMAs (part 4) - Error capture"

Since the active/inactive lists are per VM, we need to modify the error
capture code to be aware of this, and also extend it to capture the
buffers from all the VMs. For now all the code assumes only 1 VM, but it
will become more generic over the next few patches.

NOTE: If the number of VMs in a real world system grows significantly
we'll have to focus on only capturing the guilty VM, or else it's likely
there won't be enough space for error capture.

v2: Squashed in the "part 6" which had dependencies on the mm_list
change. Since I've moved the mm_list change to an earlier point in the
series, we were able to accomplish it here and now.

v3: Rebased over new error capture

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h       |  4 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 76 ++++++++++++++++++++++++-----------
 2 files changed, 55 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b208c30..f809204 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -323,8 +323,8 @@ struct drm_i915_error_state {
 		u32 purgeable:1;
 		s32 ring:4;
 		u32 cache_level:2;
-	} *active_bo, *pinned_bo;
-	u32 active_bo_count, pinned_bo_count;
+	} **active_bo, **pinned_bo;
+	u32 *active_bo_count, *pinned_bo_count;
 	struct intel_overlay_error_state *overlay;
 	struct intel_display_error_state *display;
 };
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 9623a4e..b834f78 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -304,13 +304,13 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 
 	if (error->active_bo)
 		print_error_buffers(m, "Active",
-				    error->active_bo,
-				    error->active_bo_count);
+				    error->active_bo[0],
+				    error->active_bo_count[0]);
 
 	if (error->pinned_bo)
 		print_error_buffers(m, "Pinned",
-				    error->pinned_bo,
-				    error->pinned_bo_count);
+				    error->pinned_bo[0],
+				    error->pinned_bo_count[0]);
 
 	for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
 		struct drm_i915_error_object *obj;
@@ -775,42 +775,72 @@ static void i915_gem_record_rings(struct drm_device *dev,
 	}
 }
 
-static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
-				     struct drm_i915_error_state *error)
+/* FIXME: Since pin count/bound list is global, we duplicate what we capture per
+ * VM.
+ */
+static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
+				struct drm_i915_error_state *error,
+				struct i915_address_space *vm,
+				const int ndx)
 {
-	struct i915_address_space *vm = &dev_priv->gtt.base;
-	struct i915_vma *vma;
+	struct drm_i915_error_buffer *active_bo = NULL, *pinned_bo = NULL;
 	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	int i;
 
 	i = 0;
 	list_for_each_entry(vma, &vm->active_list, mm_list)
 		i++;
-	error->active_bo_count = i;
+	error->active_bo_count[ndx] = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
 		if (obj->pin_count)
 			i++;
-	error->pinned_bo_count = i - error->active_bo_count;
+	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
 
 	if (i) {
-		error->active_bo = kmalloc(sizeof(*error->active_bo)*i,
-					   GFP_ATOMIC);
-		if (error->active_bo)
-			error->pinned_bo =
-				error->active_bo + error->active_bo_count;
+		active_bo = kmalloc(sizeof(*active_bo)*i, GFP_ATOMIC);
+		if (active_bo)
+			pinned_bo = active_bo + error->active_bo_count[ndx];
 	}
 
-	if (error->active_bo)
-		error->active_bo_count =
-			capture_active_bo(error->active_bo,
-					  error->active_bo_count,
+	if (active_bo)
+		error->active_bo_count[ndx] =
+			capture_active_bo(active_bo,
+					  error->active_bo_count[ndx],
 					  &vm->active_list);
 
-	if (error->pinned_bo)
-		error->pinned_bo_count =
-			capture_pinned_bo(error->pinned_bo,
-					  error->pinned_bo_count,
+	if (pinned_bo)
+		error->pinned_bo_count[ndx] =
+			capture_pinned_bo(pinned_bo,
+					  error->pinned_bo_count[ndx],
 					  &dev_priv->mm.bound_list);
+	error->active_bo[ndx] = active_bo;
+	error->pinned_bo[ndx] = pinned_bo;
+}
+
+static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
+				     struct drm_i915_error_state *error)
+{
+	struct i915_address_space *vm;
+	int cnt = 0, i = 0;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		cnt++;
+
+	if (WARN(cnt > 1, "Multiple VMs not yet supported\n"))
+		cnt = 1;
+
+	vm = &dev_priv->gtt.base;
+
+	error->active_bo = kcalloc(cnt, sizeof(*error->active_bo), GFP_ATOMIC);
+	error->pinned_bo = kcalloc(cnt, sizeof(*error->pinned_bo), GFP_ATOMIC);
+	error->active_bo_count = kcalloc(cnt, sizeof(*error->active_bo_count),
+					 GFP_ATOMIC);
+	error->pinned_bo_count = kcalloc(cnt, sizeof(*error->pinned_bo_count),
+					 GFP_ATOMIC);
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		i915_gem_capture_vm(dev_priv, error, vm, i++);
 }
 
 /**
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 04/12] drm/i915: Track active by VMA instead of object
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (2 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 03/12] drm/i915: Update error capture for VMs Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-23 16:48   ` Daniel Vetter
  2013-07-22  2:08 ` [PATCH 05/12] drm/i915: Add map/unmap object functions to VM Ben Widawsky
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Even though we want to be able to track active by VMA, the rest of the
code is still using objects for most internal APIs. To solve this,
create an object_is_active() function to help us in converting over to
VMA usage.

Because we intend to keep around some functions that care about objects,
and not VMAs, having this function around will be useful even as we
begin to use VMAs more in function arguments.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 15 +++----
 drivers/gpu/drm/i915/i915_gem.c            | 64 ++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 3 files changed, 48 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f809204..bdce9c1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -541,6 +541,13 @@ struct i915_vma {
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm;
 
+	/**
+	 * This is set if the object is on the active lists (has pending
+	 * rendering and so a non-zero seqno), and is not set if it i s on
+	 * inactive (ready to be unbound) list.
+	 */
+	unsigned int active:1;
+
 	/** This object's place on the active/inactive lists */
 	struct list_head mm_list;
 
@@ -1266,13 +1273,6 @@ struct drm_i915_gem_object {
 	struct list_head exec_list;
 
 	/**
-	 * This is set if the object is on the active lists (has pending
-	 * rendering and so a non-zero seqno), and is not set if it i s on
-	 * inactive (ready to be unbound) list.
-	 */
-	unsigned int active:1;
-
-	/**
 	 * This is set if the object has been written to since last bound
 	 * to the GTT
 	 */
@@ -1726,6 +1726,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
+bool i915_gem_object_is_active(struct drm_i915_gem_object *obj);
 void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 				    struct i915_address_space *vm,
 				    struct intel_ring_buffer *ring);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6bdf89d..9ea6424 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -119,10 +119,22 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 	return 0;
 }
 
+/* NB: Not the same as !i915_gem_object_is_inactive */
+bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
+{
+	struct i915_vma *vma;
+
+	list_for_each_entry(vma, &obj->vma_list, vma_link)
+		if (vma->active)
+			return true;
+
+	return false;
+}
+
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return i915_gem_obj_bound_any(obj) && !obj->active;
+	return i915_gem_obj_bound_any(obj) && !i915_gem_object_is_active(obj);
 }
 
 int
@@ -1883,14 +1895,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 	obj->ring = ring;
 
+	/* Move from whatever list we were on to the tail of execution. */
+	vma = i915_gem_obj_to_vma(obj, vm);
 	/* Add a reference if we're newly entering the active list. */
-	if (!obj->active) {
+	if (!vma->active) {
 		drm_gem_object_reference(&obj->base);
-		obj->active = 1;
+		vma->active = 1;
 	}
 
-	/* Move from whatever list we were on to the tail of execution. */
-	vma = i915_gem_obj_to_vma(obj, vm);
 	list_move_tail(&vma->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
@@ -1911,16 +1923,23 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
-				 struct i915_address_space *vm)
+i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	struct i915_address_space *vm;
 	struct i915_vma *vma;
+	int i = 0;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
-	BUG_ON(!obj->active);
 
-	vma = i915_gem_obj_to_vma(obj, vm);
-	list_move_tail(&vma->mm_list, &vm->inactive_list);
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		vma = i915_gem_obj_to_vma(obj, vm);
+		if (!vma || !vma->active)
+			continue;
+		list_move_tail(&vma->mm_list, &vm->inactive_list);
+		vma->active = 0;
+		i++;
+	}
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -1932,8 +1951,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
 	obj->last_fenced_seqno = 0;
 	obj->fenced_gpu_access = false;
 
-	obj->active = 0;
-	drm_gem_object_unreference(&obj->base);
+	while (i--)
+		drm_gem_object_unreference(&obj->base);
 
 	WARN_ON(i915_verify_lists(dev));
 }
@@ -2254,15 +2273,13 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
 	}
 
 	while (!list_empty(&ring->active_list)) {
-		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
 				       struct drm_i915_gem_object,
 				       ring_list);
 
-		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-			i915_gem_object_move_to_inactive(obj, vm);
+		i915_gem_object_move_to_inactive(obj);
 	}
 }
 
@@ -2348,8 +2365,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 	 * by the ringbuffer to the flushing/inactive lists as appropriate.
 	 */
 	while (!list_empty(&ring->active_list)) {
-		struct drm_i915_private *dev_priv = ring->dev->dev_private;
-		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
@@ -2359,8 +2374,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
 			break;
 
-		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-			i915_gem_object_move_to_inactive(obj, vm);
+		BUG_ON(!i915_gem_object_is_active(obj));
+		i915_gem_object_move_to_inactive(obj);
 	}
 
 	if (unlikely(ring->trace_irq_seqno &&
@@ -2435,7 +2450,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
 {
 	int ret;
 
-	if (obj->active) {
+	if (i915_gem_object_is_active(obj)) {
 		ret = i915_gem_check_olr(obj->ring, obj->last_read_seqno);
 		if (ret)
 			return ret;
@@ -2500,7 +2515,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	if (ret)
 		goto out;
 
-	if (obj->active) {
+	if (i915_gem_object_is_active(obj)) {
 		seqno = obj->last_read_seqno;
 		ring = obj->ring;
 	}
@@ -3850,7 +3865,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 	 */
 	ret = i915_gem_object_flush_active(obj);
 
-	args->busy = obj->active;
+	args->busy = i915_gem_object_is_active(obj);
 	if (obj->ring) {
 		BUILD_BUG_ON(I915_NUM_RINGS > 16);
 		args->busy |= intel_ring_flag(obj->ring) << 16;
@@ -4716,13 +4731,12 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 			cnt += obj->base.size >> PAGE_SHIFT;
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		if (obj->active)
+		if (i915_gem_object_is_active(obj))
 			continue;
 
 		i915_gem_object_flush_gtt_write_domain(obj);
 		i915_gem_object_flush_cpu_write_domain(obj);
-		/* FIXME: Can't assume global gtt */
-		i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base);
+		i915_gem_object_move_to_inactive(obj);
 
 		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 819d8d8..8d2643b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -251,7 +251,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	}
 
 	/* We can't wait for rendering with pagefaults disabled */
-	if (obj->active && in_atomic())
+	if (i915_gem_object_is_active(obj) && in_atomic())
 		return -EFAULT;
 
 	reloc->delta += target_offset;
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 05/12] drm/i915: Add map/unmap object functions to VM
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (3 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 04/12] drm/i915: Track active by VMA instead of object Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-22  2:08 ` [PATCH 06/12] drm/i915: Use the new vm [un]bind functions Ben Widawsky
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->map, and not have to worry about
distinguishing PPGTT vs GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to map object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels
Use VMA for map/unmap, call it map/unmap (Daniel, Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  51 ++++++++++--------
 drivers/gpu/drm/i915/i915_gem_gtt.c | 100 ++++++++++++++++++++++++++++++++++++
 2 files changed, 130 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bdce9c1..f3f2825 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -446,6 +446,27 @@ enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+/* To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an objects lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct drm_mm_node node;
+	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+
+	/**
+	 * This is set if the object is on the active lists (has pending
+	 * rendering and so a non-zero seqno), and is not set if it i s on
+	 * inactive (ready to be unbound) list.
+	 */
+	unsigned int active:1;
+
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+};
+
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
@@ -484,9 +505,18 @@ struct i915_address_space {
 	/* FIXME: Need a more generic return type */
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
+
+	/** Unmap an object from an address space. This usually consists of
+	 * setting the valid PTE entries to a reserved scratch page. */
+	void (*unmap_vma)(struct i915_vma *vma);
 	void (*clear_range)(struct i915_address_space *vm,
 			    unsigned int first_entry,
 			    unsigned int num_entries);
+	/* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+	void (*map_vma)(struct i915_vma *vma,
+			enum i915_cache_level cache_level,
+			u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       unsigned int first_entry,
@@ -533,27 +563,6 @@ struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
-/* To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an objects lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
-	struct drm_mm_node node;
-	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm;
-
-	/**
-	 * This is set if the object is on the active lists (has pending
-	 * rendering and so a non-zero seqno), and is not set if it i s on
-	 * inactive (ready to be unbound) list.
-	 */
-	unsigned int active:1;
-
-	/** This object's place on the active/inactive lists */
-	struct list_head mm_list;
-
-	struct list_head vma_link; /* Link in the object's VMA list */
-};
-
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 44f3464..03e6179 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -55,6 +55,11 @@
 #define HSW_WB_LLC_AGE0			HSW_CACHEABILITY_CONTROL(0x3)
 #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 
+static void gen6_ppgtt_map_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 flags);
+static void gen6_ppgtt_unmap_vma(struct i915_vma *vma);
+
 static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
 				      enum i915_cache_level level)
 {
@@ -307,7 +312,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	}
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->base.unmap_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.map_vma = NULL;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -419,6 +426,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			   cache_level);
 }
 
+static void __always_unused gen6_ppgtt_map_vma(struct i915_vma *vma,
+					       enum i915_cache_level cache_level,
+					       u32 flags)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	WARN_ON(flags);
+
+	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
@@ -429,6 +447,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			obj->base.size >> PAGE_SHIFT);
 }
 
+static void __always_unused gen6_ppgtt_unmap_vma(struct i915_vma *vma)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ppgtt_clear_range(vma->vm, entry,
+			       vma->obj->base.size >> PAGE_SHIFT);
+}
+
 extern int intel_iommu_gfx_mapped;
 /* Certain Gen5 chipsets require require idling the GPU before
  * unmapping anything from the GTT when VT-d is enabled.
@@ -577,6 +603,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 
 }
 
+static void i915_ggtt_map_vma(struct i915_vma *vma,
+			      enum i915_cache_level cache_level,
+			      u32 unused)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	vma->obj->has_global_gtt_mapping = 1;
+}
+
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
@@ -584,6 +623,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
+static void i915_ggtt_unmap_vma(struct i915_vma *vma)
+{
+	const unsigned int first = vma->node.start >> PAGE_SHIFT;
+	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	vma->obj->has_global_gtt_mapping = 0;
+	intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_map_vma(struct i915_vma *vma,
+			      enum i915_cache_level cache_level,
+			      u32 flags)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
+	 * the global, just use aliasing */
+	if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
+	    !obj->has_global_gtt_mapping) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+		return;
+	}
+
+	gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
+	obj->has_global_gtt_mapping = 1;
+
+	/* If put the mapping in the aliasing PPGTT as well as Global if we have
+	 * aliasing, but the user requested global. */
+	if (dev_priv->mm.aliasing_ppgtt) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+	}
+}
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 			      enum i915_cache_level cache_level)
@@ -612,6 +691,23 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 	obj->has_global_gtt_mapping = 0;
 }
 
+static void gen6_ggtt_unmap_vma(struct i915_vma *vma)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ggtt_clear_range(vma->vm, entry,
+			      vma->obj->base.size >> PAGE_SHIFT);
+	vma->obj->has_global_gtt_mapping = 0;
+	if (dev_priv->mm.aliasing_ppgtt && vma->obj->has_aliasing_ppgtt_mapping) {
+		gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+				       entry,
+				       vma->obj->base.size >> PAGE_SHIFT);
+		vma->obj->has_aliasing_ppgtt_mapping = 0;
+	}
+}
+
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -845,7 +941,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
 		DRM_ERROR("Scratch setup failed\n");
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+	dev_priv->gtt.base.unmap_vma = gen6_ggtt_unmap_vma;
 	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+	dev_priv->gtt.base.map_vma = gen6_ggtt_map_vma;
 
 	return ret;
 }
@@ -877,7 +975,9 @@ static int i915_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+	dev_priv->gtt.base.unmap_vma = i915_ggtt_unmap_vma;
 	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+	dev_priv->gtt.base.map_vma = i915_ggtt_map_vma;
 
 	return 0;
 }
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 06/12] drm/i915: Use the new vm [un]bind functions
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (4 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 05/12] drm/i915: Add map/unmap object functions to VM Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-23 16:54   ` Daniel Vetter
  2013-07-22  2:08 ` [PATCH 07/12] drm/i915: eliminate vm->insert_entries() Ben Widawsky
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initialy, but we cannot.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            | 10 ------
 drivers/gpu/drm/i915/i915_gem.c            | 37 +++++++++------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  7 ++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 ++++++++--------
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 53 ++----------------------------
 5 files changed, 37 insertions(+), 99 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f3f2825..8d6aa34 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1933,18 +1933,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 
 /* i915_gem_gtt.c */
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
-/* FIXME: this is never okay with full PPGTT */
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-				enum i915_cache_level cache_level);
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9ea6424..63297d7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2653,12 +2653,9 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 
 	trace_i915_gem_object_unbind(obj, vm);
 
-	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
-		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+	vma = i915_gem_obj_to_vma(obj, vm);
+	vm->unmap_vma(vma);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -2666,7 +2663,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 	if (i915_is_ggtt(vm))
 		obj->map_and_fenceable = true;
 
-	vma = i915_gem_obj_to_vma(obj, vm);
 	list_del(&vma->mm_list);
 	list_del(&vma->vma_link);
 	drm_mm_remove_node(&vma->node);
@@ -3372,7 +3368,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
 	int ret;
 
@@ -3407,13 +3402,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
-			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-					       obj, cache_level);
-
-		i915_gem_obj_set_color(obj, vma->vm, cache_level);
+		vm->map_vma(vma, cache_level, 0);
+		i915_gem_obj_set_color(obj, vm, cache_level);
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -3695,6 +3685,8 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    bool map_and_fenceable,
 		    bool nonblocking)
 {
+	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
+	struct i915_vma *vma;
 	int ret;
 
 	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
@@ -3702,6 +3694,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 
 	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
 
+	/* FIXME: Use vma for bounds check */
 	if (i915_gem_obj_bound(obj, vm)) {
 		if ((alignment &&
 		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
@@ -3720,20 +3713,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_obj_bound(obj, vm)) {
-		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-
 		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
 						 map_and_fenceable,
 						 nonblocking);
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
-			i915_gem_gtt_bind_object(obj, obj->cache_level);
-	}
+		vma = i915_gem_obj_to_vma(obj, vm);
+		vm->map_vma(vma, obj->cache_level, flags);
+	} else
+		vma = i915_gem_obj_to_vma(obj, vm);
 
+	/* Objects are created map and fenceable. If we bind an object
+	 * the first time, and we had aliasing PPGTT (and didn't request
+	 * GLOBAL), we'll need to do this on the second bind.*/
 	if (!obj->has_global_gtt_mapping && map_and_fenceable)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vm->map_vma(vma, obj->cache_level, GLOBAL_BIND);
 
 	obj->pin_count++;
 	obj->pin_mappable |= map_and_fenceable;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 873577d..cc7c0b4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -417,8 +417,11 @@ static int do_switch(struct i915_hw_context *to)
 		return ret;
 	}
 
-	if (!to->obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
+	if (!to->obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
+							   &dev_priv->gtt.base);
+		vma->vm->map_vma(vma, to->obj->cache_level, GLOBAL_BIND);
+	}
 
 	if (!to->is_initialized || is_default_context(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 8d2643b..6359ef2 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -197,8 +197,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (unlikely(IS_GEN6(dev) &&
 	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
 	    !target_i915_obj->has_global_gtt_mapping)) {
-		i915_gem_gtt_bind_object(target_i915_obj,
-					 target_i915_obj->cache_level);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		vma->vm->map_vma(vma, target_i915_obj->cache_level,
+				 GLOBAL_BIND);
 	}
 
 	/* Validate that the target is in a valid r/w GPU domain */
@@ -404,10 +405,12 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 				   struct i915_address_space *vm,
 				   bool *need_reloc)
 {
-	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
+	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
+		!obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
+	struct i915_vma *vma;
 	int ret;
 
 	need_fence =
@@ -421,6 +424,7 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
+	vma = i915_gem_obj_to_vma(obj, vm);
 	entry->flags |= __EXEC_OBJECT_HAS_PIN;
 
 	if (has_fenced_gpu_access) {
@@ -436,14 +440,6 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		}
 	}
 
-	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
-		obj->has_aliasing_ppgtt_mapping = 1;
-	}
-
 	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
 		entry->offset = i915_gem_obj_offset(obj, vm);
 		*need_reloc = true;
@@ -454,9 +450,7 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
-	    !obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+	vm->map_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
@@ -1047,8 +1041,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but let's be paranoid and do it
 	 * unconditionally for now. */
-	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
-		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
+	if (flags & I915_DISPATCH_SECURE &&
+	    !batch_obj->has_global_gtt_mapping) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
+		vm->map_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
+	}
 
 	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->objects);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 03e6179..1de49a0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -414,18 +414,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	struct i915_address_space *vm = &ppgtt->base;
-	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
-
-	vm->insert_entries(vm, obj->pages,
-			   obj_offset >> PAGE_SHIFT,
-			   cache_level);
-}
-
 static void __always_unused gen6_ppgtt_map_vma(struct i915_vma *vma,
 					       enum i915_cache_level cache_level,
 					       u32 flags)
@@ -437,16 +425,6 @@ static void __always_unused gen6_ppgtt_map_vma(struct i915_vma *vma,
 	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
 }
 
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	struct i915_address_space *vm = &ppgtt->base;
-	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
-
-	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
-			obj->base.size >> PAGE_SHIFT);
-}
-
 static void __always_unused gen6_ppgtt_unmap_vma(struct i915_vma *vma)
 {
 	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
@@ -507,8 +485,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
 		i915_gem_clflush_object(obj);
-		i915_gem_gtt_bind_object(obj, obj->cache_level);
+		vma->vm->map_vma(vma, obj->cache_level, 0);
 	}
 
 	i915_gem_chipset_flush(dev);
@@ -664,33 +644,6 @@ static void gen6_ggtt_map_vma(struct i915_vma *vma,
 	}
 }
 
-void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
-			      enum i915_cache_level cache_level)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
-					  entry,
-					  cache_level);
-
-	obj->has_global_gtt_mapping = 1;
-}
-
-void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
-{
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
-
-	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
-				       entry,
-				       obj->base.size >> PAGE_SHIFT);
-
-	obj->has_global_gtt_mapping = 0;
-}
-
 static void gen6_ggtt_unmap_vma(struct i915_vma *vma)
 {
 	struct drm_device *dev = vma->vm->dev;
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 07/12] drm/i915: eliminate vm->insert_entries()
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (5 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 06/12] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-23 16:57   ` Daniel Vetter
  2013-07-22  2:08 ` [PATCH 08/12] drm/i915: Add vma to list at creation Ben Widawsky
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want, to remove clear_range, however it's
not totally easy at this point. Since it's used in a couple of place
still that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1de49a0..5c04887 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -315,7 +315,6 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.unmap_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
 	ppgtt->base.map_vma = NULL;
-	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
 	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
@@ -570,19 +569,6 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 	readl(gtt_base);
 }
 
-
-static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     unsigned int pg_start,
-				     enum i915_cache_level cache_level)
-{
-	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
-		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
-
-	intel_gtt_insert_sg_entries(st, pg_start, flags);
-
-}
-
 static void i915_ggtt_map_vma(struct i915_vma *vma,
 			      enum i915_cache_level cache_level,
 			      u32 unused)
@@ -895,7 +881,6 @@ static int gen6_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
 	dev_priv->gtt.base.unmap_vma = gen6_ggtt_unmap_vma;
-	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
 	dev_priv->gtt.base.map_vma = gen6_ggtt_map_vma;
 
 	return ret;
@@ -929,7 +914,6 @@ static int i915_gmch_probe(struct drm_device *dev,
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
 	dev_priv->gtt.base.unmap_vma = i915_ggtt_unmap_vma;
-	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
 	dev_priv->gtt.base.map_vma = i915_ggtt_map_vma;
 
 	return 0;
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 08/12] drm/i915: Add vma to list at creation
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (6 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 07/12] drm/i915: eliminate vm->insert_entries() Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-22  2:08 ` [PATCH 09/12] drm/i915: create vmas at execbuf Ben Widawsky
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With the current code there shouldn't be a distinction - however with an
upcoming change we intend to allocate a vma much earlier, before it's
actually bound anywhere.

To do this we have to check node allocation as well for the _bound()
check.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 63297d7..a6dc653 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3190,12 +3190,6 @@ search_free:
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&vma->mm_list, &vm->inactive_list);
 
-	/* Keep GGTT vmas first to make debug easier */
-	if (i915_is_ggtt(vm))
-		list_add(&vma->vma_link, &obj->vma_list);
-	else
-		list_add_tail(&vma->vma_link, &obj->vma_list);
-
 	fenceable =
 		i915_is_ggtt(vm) &&
 		i915_gem_obj_ggtt_size(obj) == fence_size &&
@@ -4069,6 +4063,12 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 	vma->vm = vm;
 	vma->obj = obj;
 
+	/* Keep GGTT vmas first to make debug easier */
+	if (i915_is_ggtt(vm))
+		list_add(&vma->vma_link, &obj->vma_list);
+	else
+		list_add_tail(&vma->vma_link, &obj->vma_list);
+
 	return vma;
 }
 
@@ -4767,7 +4767,7 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
 	struct i915_vma *vma;
 
 	list_for_each_entry(vma, &o->vma_list, vma_link)
-		if (vma->vm == vm)
+		if (vma->vm == vm && drm_mm_node_allocated(&vma->node))
 			return true;
 
 	return false;
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 09/12] drm/i915: create vmas at execbuf
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (7 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 08/12] drm/i915: Add vma to list at creation Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-22 13:32   ` Chris Wilson
  2013-07-22  2:08 ` [PATCH 10/12] drm/i915: Convert execbuf code to use vmas Ben Widawsky
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

In order to transition more of our code over to using a VMA instead of
an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up
until now, we've only had a VMA when actually binding an object.

The previous patch helped handle the distinction on bound vs. unbound.
This patch will help us catch leaks, and other issues before we actually
shuffle a bunch of stuff around.

The subsequent patch to fix up the rest of execbuf should be mostly just
moving code around, and this is the major functional change.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  3 +++
 drivers/gpu/drm/i915/i915_gem.c            | 26 ++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 ++++++++--
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8d6aa34..59a8c03 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1867,6 +1867,9 @@ void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
 			    enum i915_cache_level color);
 struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm);
+struct i915_vma *
+i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
+				  struct i915_address_space *vm);
 /* Some GGTT VM helpers */
 #define obj_to_ggtt(obj) \
 	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a6dc653..0fa6667 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3111,9 +3111,6 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	struct i915_vma *vma;
 	int ret;
 
-	if (WARN_ON(!list_empty(&obj->vma_list)))
-		return -EBUSY;
-
 	BUG_ON(!i915_is_ggtt(vm));
 
 	fence_size = i915_gem_get_gtt_size(dev,
@@ -3154,15 +3151,15 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	/* For now we only ever use 1 vma per object */
-	WARN_ON(!list_empty(&obj->vma_list));
-
-	vma = i915_gem_vma_create(obj, vm);
+	vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
 	if (IS_ERR(vma)) {
 		i915_gem_object_unpin_pages(obj);
 		return PTR_ERR(vma);
 	}
 
+	/* For now we only ever use 1 vma per object */
+	WARN_ON(!list_is_singular(&obj->vma_list));
+
 search_free:
 	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
 						  size, alignment,
@@ -4054,7 +4051,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm)
 {
-	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
+	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_ATOMIC);
 	if (vma == NULL)
 		return ERR_PTR(-ENOMEM);
 
@@ -4829,3 +4826,16 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 
 	return NULL;
 }
+
+struct i915_vma *
+i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
+				  struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+
+	vma = i915_gem_obj_to_vma(obj, vm);
+	if (!vma)
+		vma = i915_gem_vma_create(obj, vm);
+
+	return vma;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 6359ef2..1f82a04 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -85,12 +85,14 @@ static int
 eb_lookup_objects(struct eb_objects *eb,
 		  struct drm_i915_gem_exec_object2 *exec,
 		  const struct drm_i915_gem_execbuffer2 *args,
+		  struct i915_address_space *vm,
 		  struct drm_file *file)
 {
 	int i;
 
 	spin_lock(&file->table_lock);
 	for (i = 0; i < args->buffer_count; i++) {
+		struct i915_vma *vma;
 		struct drm_i915_gem_object *obj;
 
 		obj = to_intel_bo(idr_find(&file->object_idr, exec[i].handle));
@@ -111,6 +113,10 @@ eb_lookup_objects(struct eb_objects *eb,
 		drm_gem_object_reference(&obj->base);
 		list_add_tail(&obj->exec_list, &eb->objects);
 
+		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
+		if (IS_ERR(vma))
+			return PTR_ERR(vma);
+
 		obj->exec_entry = &exec[i];
 		if (eb->and < 0) {
 			eb->lut[i] = obj;
@@ -666,7 +672,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 
 	/* reacquire the objects */
 	eb_reset(eb);
-	ret = eb_lookup_objects(eb, exec, args, file);
+	ret = eb_lookup_objects(eb, exec, args, vm, file);
 	if (ret)
 		goto err;
 
@@ -1001,7 +1007,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	}
 
 	/* Look up object handles */
-	ret = eb_lookup_objects(eb, exec, args, file);
+	ret = eb_lookup_objects(eb, exec, args, vm, file);
 	if (ret)
 		goto err;
 
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 10/12] drm/i915: Convert execbuf code to use vmas
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (8 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 09/12] drm/i915: create vmas at execbuf Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-22  2:08 ` [PATCH 11/12] drm/i915: Convert object coloring to VMA Ben Widawsky
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This attempts to convert all the execbuf code to speak in vmas. Since
the execbuf code is very self contained it was a nice isolated
conversion.

The meat of the code is about turning eb_objects into eb_vma, and then
wiring up the rest of the code to use vmas instead of obj, vm pairs.

Unfortunately, to do this, we must move the exec_list link from the obj
structure. This list is reused in the shrinker code, so we must also
modify the shrinker code to make this work.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  20 +-
 drivers/gpu/drm/i915/i915_gem.c            |   2 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  31 ++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 295 ++++++++++++++---------------
 4 files changed, 171 insertions(+), 177 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 59a8c03..fe41a3d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -465,6 +465,17 @@ struct i915_vma {
 	struct list_head mm_list;
 
 	struct list_head vma_link; /* Link in the object's VMA list */
+
+	/** This vma's place in the batchbuffer or on the eviction list */
+	struct list_head exec_list;
+
+	/**
+	 * Used for performing relocations during execbuffer insertion.
+	 */
+	struct hlist_node exec_node;
+	unsigned long exec_handle;
+	struct drm_i915_gem_exec_object2 *exec_entry;
+
 };
 
 struct i915_address_space {
@@ -1278,8 +1289,6 @@ struct drm_i915_gem_object {
 	struct list_head global_list;
 
 	struct list_head ring_list;
-	/** This object's place in the batchbuffer or on the eviction list */
-	struct list_head exec_list;
 
 	/**
 	 * This is set if the object has been written to since last bound
@@ -1357,13 +1366,6 @@ struct drm_i915_gem_object {
 	void *dma_buf_vmapping;
 	int vmapping_count;
 
-	/**
-	 * Used for performing relocations during execbuffer insertion.
-	 */
-	struct hlist_node exec_node;
-	unsigned long exec_handle;
-	struct drm_i915_gem_exec_object2 *exec_entry;
-
 	struct intel_ring_buffer *ring;
 
 	/** Breadcrumb of last rendering to the buffer. */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0fa6667..397a4b4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3922,7 +3922,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 {
 	INIT_LIST_HEAD(&obj->global_list);
 	INIT_LIST_HEAD(&obj->ring_list);
-	INIT_LIST_HEAD(&obj->exec_list);
 	INIT_LIST_HEAD(&obj->vma_list);
 
 	obj->ops = ops;
@@ -4057,6 +4056,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 
 	INIT_LIST_HEAD(&vma->vma_link);
 	INIT_LIST_HEAD(&vma->mm_list);
+	INIT_LIST_HEAD(&vma->exec_list);
 	vma->vm = vm;
 	vma->obj = obj;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 18a44a9..c860c5b 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -37,7 +37,7 @@ mark_free(struct i915_vma *vma, struct list_head *unwind)
 	if (vma->obj->pin_count)
 		return false;
 
-	list_add(&vma->obj->exec_list, unwind);
+	list_add(&vma->exec_list, unwind);
 	return drm_mm_scan_add_block(&vma->node);
 }
 
@@ -49,7 +49,6 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct list_head eviction_list, unwind_list;
 	struct i915_vma *vma;
-	struct drm_i915_gem_object *obj;
 	int ret = 0;
 
 	trace_i915_gem_evict(dev, min_size, alignment, mappable);
@@ -104,14 +103,13 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 none:
 	/* Nothing found, clean up and bail out! */
 	while (!list_empty(&unwind_list)) {
-		obj = list_first_entry(&unwind_list,
-				       struct drm_i915_gem_object,
+		vma = list_first_entry(&unwind_list,
+				       struct i915_vma,
 				       exec_list);
-		vma = i915_gem_obj_to_vma(obj, vm);
 		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
-		list_del_init(&obj->exec_list);
+		list_del_init(&vma->exec_list);
 	}
 
 	/* We expect the caller to unpin, evict all and try again, or give up.
@@ -125,28 +123,27 @@ found:
 	 * temporary list. */
 	INIT_LIST_HEAD(&eviction_list);
 	while (!list_empty(&unwind_list)) {
-		obj = list_first_entry(&unwind_list,
-				       struct drm_i915_gem_object,
+		vma = list_first_entry(&unwind_list,
+				       struct i915_vma,
 				       exec_list);
-		vma = i915_gem_obj_to_vma(obj, vm);
 		if (drm_mm_scan_remove_block(&vma->node)) {
-			list_move(&obj->exec_list, &eviction_list);
-			drm_gem_object_reference(&obj->base);
+			list_move(&vma->exec_list, &eviction_list);
+			drm_gem_object_reference(&vma->obj->base);
 			continue;
 		}
-		list_del_init(&obj->exec_list);
+		list_del_init(&vma->exec_list);
 	}
 
 	/* Unbinding will emit any required flushes */
 	while (!list_empty(&eviction_list)) {
-		obj = list_first_entry(&eviction_list,
-				       struct drm_i915_gem_object,
+		vma = list_first_entry(&eviction_list,
+				       struct i915_vma,
 				       exec_list);
 		if (ret == 0)
-			ret = i915_gem_object_unbind(obj, vm);
+			ret = i915_gem_object_unbind(vma->obj, vm);
 
-		list_del_init(&obj->exec_list);
-		drm_gem_object_unreference(&obj->base);
+		list_del_init(&vma->exec_list);
+		drm_gem_object_unreference(&vma->obj->base);
 	}
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 1f82a04..75325c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -33,24 +33,24 @@
 #include "intel_drv.h"
 #include <linux/dma_remapping.h>
 
-struct eb_objects {
-	struct list_head objects;
+struct eb_vmas {
+	struct list_head vmas;
 	int and;
 	union {
-		struct drm_i915_gem_object *lut[0];
+		struct i915_vma *lut[0];
 		struct hlist_head buckets[0];
 	};
 };
 
-static struct eb_objects *
-eb_create(struct drm_i915_gem_execbuffer2 *args)
+static struct eb_vmas *
+eb_create(struct drm_i915_gem_execbuffer2 *args, struct i915_address_space *vm)
 {
-	struct eb_objects *eb = NULL;
+	struct eb_vmas *eb = NULL;
 
 	if (args->flags & I915_EXEC_HANDLE_LUT) {
 		int size = args->buffer_count;
-		size *= sizeof(struct drm_i915_gem_object *);
-		size += sizeof(struct eb_objects);
+		size *= sizeof(struct i915_vma *);
+		size += sizeof(struct eb_vmas);
 		eb = kmalloc(size, GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
 	}
 
@@ -61,7 +61,7 @@ eb_create(struct drm_i915_gem_execbuffer2 *args)
 		while (count > 2*size)
 			count >>= 1;
 		eb = kzalloc(count*sizeof(struct hlist_head) +
-			     sizeof(struct eb_objects),
+			     sizeof(struct eb_vmas),
 			     GFP_TEMPORARY);
 		if (eb == NULL)
 			return eb;
@@ -70,23 +70,23 @@ eb_create(struct drm_i915_gem_execbuffer2 *args)
 	} else
 		eb->and = -args->buffer_count;
 
-	INIT_LIST_HEAD(&eb->objects);
+	INIT_LIST_HEAD(&eb->vmas);
 	return eb;
 }
 
 static void
-eb_reset(struct eb_objects *eb)
+eb_reset(struct eb_vmas *eb)
 {
 	if (eb->and >= 0)
 		memset(eb->buckets, 0, (eb->and+1)*sizeof(struct hlist_head));
 }
 
 static int
-eb_lookup_objects(struct eb_objects *eb,
-		  struct drm_i915_gem_exec_object2 *exec,
-		  const struct drm_i915_gem_execbuffer2 *args,
-		  struct i915_address_space *vm,
-		  struct drm_file *file)
+eb_lookup_vmas(struct eb_vmas *eb,
+	       struct drm_i915_gem_exec_object2 *exec,
+	       const struct drm_i915_gem_execbuffer2 *args,
+	       struct i915_address_space *vm,
+	       struct drm_file *file)
 {
 	int i;
 
@@ -103,7 +103,11 @@ eb_lookup_objects(struct eb_objects *eb,
 			return -ENOENT;
 		}
 
-		if (!list_empty(&obj->exec_list)) {
+		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
+		if (IS_ERR(vma))
+			return PTR_ERR(vma);
+
+		if (!list_empty(&vma->exec_list)) {
 			spin_unlock(&file->table_lock);
 			DRM_DEBUG("Object %p [handle %d, index %d] appears more than once in object list\n",
 				   obj, exec[i].handle, i);
@@ -111,19 +115,16 @@ eb_lookup_objects(struct eb_objects *eb,
 		}
 
 		drm_gem_object_reference(&obj->base);
-		list_add_tail(&obj->exec_list, &eb->objects);
+		list_add_tail(&vma->exec_list, &eb->vmas);
 
-		vma = i915_gem_obj_lookup_or_create_vma(obj, vm);
-		if (IS_ERR(vma))
-			return PTR_ERR(vma);
+		vma->exec_entry = &exec[i];
 
-		obj->exec_entry = &exec[i];
 		if (eb->and < 0) {
-			eb->lut[i] = obj;
+			eb->lut[i] = vma;
 		} else {
 			uint32_t handle = args->flags & I915_EXEC_HANDLE_LUT ? i : exec[i].handle;
-			obj->exec_handle = handle;
-			hlist_add_head(&obj->exec_node,
+			vma->exec_handle = handle;
+			hlist_add_head(&vma->exec_node,
 				       &eb->buckets[handle & eb->and]);
 		}
 	}
@@ -132,8 +133,7 @@ eb_lookup_objects(struct eb_objects *eb,
 	return 0;
 }
 
-static struct drm_i915_gem_object *
-eb_get_object(struct eb_objects *eb, unsigned long handle)
+static struct i915_vma *eb_get_vma(struct eb_vmas *eb, unsigned long handle)
 {
 	if (eb->and < 0) {
 		if (handle >= -eb->and)
@@ -145,27 +145,25 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
 
 		head = &eb->buckets[handle & eb->and];
 		hlist_for_each(node, head) {
-			struct drm_i915_gem_object *obj;
+			struct i915_vma *vma;
 
-			obj = hlist_entry(node, struct drm_i915_gem_object, exec_node);
-			if (obj->exec_handle == handle)
-				return obj;
+			vma = hlist_entry(node, struct i915_vma, exec_node);
+			if (vma->exec_handle == handle)
+				return vma;
 		}
 		return NULL;
 	}
 }
 
-static void
-eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
-{
-	while (!list_empty(&eb->objects)) {
-		struct drm_i915_gem_object *obj;
+static void eb_destroy(struct eb_vmas *eb) {
+	while (!list_empty(&eb->vmas)) {
+		struct i915_vma *vma;
 
-		obj = list_first_entry(&eb->objects,
-				       struct drm_i915_gem_object,
+		vma = list_first_entry(&eb->vmas,
+				       struct i915_vma,
 				       exec_list);
-		list_del_init(&obj->exec_list);
-		drm_gem_object_unreference(&obj->base);
+		list_del_init(&vma->exec_list);
+		drm_gem_object_unreference(&vma->obj->base);
 	}
 	kfree(eb);
 }
@@ -179,22 +177,24 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
 
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
-				   struct eb_objects *eb,
+				   struct eb_vmas *eb,
 				   struct drm_i915_gem_relocation_entry *reloc,
 				   struct i915_address_space *vm)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_gem_object *target_obj;
 	struct drm_i915_gem_object *target_i915_obj;
+	struct i915_vma *target_vma;
 	uint32_t target_offset;
 	int ret = -EINVAL;
 
 	/* we've already hold a reference to all valid objects */
-	target_obj = &eb_get_object(eb, reloc->target_handle)->base;
-	if (unlikely(target_obj == NULL))
+	target_vma = eb_get_vma(eb, reloc->target_handle);
+	if (unlikely(target_vma == NULL))
 		return -ENOENT;
+	target_i915_obj = target_vma->obj;
+	target_obj = &target_vma->obj->base;
 
-	target_i915_obj = to_intel_bo(target_obj);
 	target_offset = i915_gem_obj_ggtt_offset(target_i915_obj);
 
 	/* Sandybridge PPGTT errata: We need a global gtt mapping for MI and
@@ -304,14 +304,13 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
-				    struct eb_objects *eb,
-				    struct i915_address_space *vm)
+i915_gem_execbuffer_relocate_vma(struct i915_vma *vma,
+				 struct eb_vmas *eb)
 {
 #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
 	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
 	struct drm_i915_gem_relocation_entry __user *user_relocs;
-	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	int remain, ret;
 
 	user_relocs = to_user_ptr(entry->relocs_ptr);
@@ -330,8 +329,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 		do {
 			u64 offset = r->presumed_offset;
 
-			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
-								 vm);
+			ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, r,
+								 vma->vm);
 			if (ret)
 				return ret;
 
@@ -352,17 +351,16 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
-					 struct eb_objects *eb,
-					 struct drm_i915_gem_relocation_entry *relocs,
-					 struct i915_address_space *vm)
+i915_gem_execbuffer_relocate_vma_slow(struct i915_vma *vma,
+				      struct eb_vmas *eb,
+				      struct drm_i915_gem_relocation_entry *relocs)
 {
-	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+	const struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	int i, ret;
 
 	for (i = 0; i < entry->relocation_count; i++) {
-		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
-							 vm);
+		ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, &relocs[i],
+							 vma->vm);
 		if (ret)
 			return ret;
 	}
@@ -371,10 +369,10 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate(struct eb_objects *eb,
+i915_gem_execbuffer_relocate(struct eb_vmas *eb,
 			     struct i915_address_space *vm)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	int ret = 0;
 
 	/* This is the fast path and we cannot handle a pagefault whilst
@@ -385,8 +383,8 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb,
 	 * lockdep complains vehemently.
 	 */
 	pagefault_disable();
-	list_for_each_entry(obj, &eb->objects, exec_list) {
-		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
+	list_for_each_entry(vma, &eb->vmas, exec_list) {
+		ret = i915_gem_execbuffer_relocate_vma(vma, eb);
 		if (ret)
 			break;
 	}
@@ -399,38 +397,36 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb,
 #define  __EXEC_OBJECT_HAS_FENCE (1<<30)
 
 static int
-need_reloc_mappable(struct drm_i915_gem_object *obj)
+need_reloc_mappable(struct i915_vma *vma)
 {
-	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
-	return entry->relocation_count && !use_cpu_reloc(obj);
+	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
+	return entry->relocation_count && !use_cpu_reloc(vma->obj);
 }
 
 static int
-i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
-				   struct intel_ring_buffer *ring,
-				   struct i915_address_space *vm,
-				   bool *need_reloc)
+i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
+				struct intel_ring_buffer *ring,
+				bool *need_reloc)
 {
-	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	bool need_fence, need_mappable;
 	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
-		!obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
-	struct i915_vma *vma;
+		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
+	struct drm_i915_gem_object *obj = vma->obj;
 	int ret;
 
 	need_fence =
 		has_fenced_gpu_access &&
 		entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 		obj->tiling_mode != I915_TILING_NONE;
-	need_mappable = need_fence || need_reloc_mappable(obj);
+	need_mappable = need_fence || need_reloc_mappable(vma);
 
-	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
+	ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, need_mappable,
 				  false);
 	if (ret)
 		return ret;
 
-	vma = i915_gem_obj_to_vma(obj, vm);
 	entry->flags |= __EXEC_OBJECT_HAS_PIN;
 
 	if (has_fenced_gpu_access) {
@@ -446,8 +442,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		}
 	}
 
-	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
-		entry->offset = i915_gem_obj_offset(obj, vm);
+	if (entry->offset != vma->node.start) {
+		entry->offset = vma->node.start;
 		*need_reloc = true;
 	}
 
@@ -456,67 +452,66 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
 	}
 
-	vm->map_vma(vma, obj->cache_level, flags);
+	vma->vm->map_vma(vma, obj->cache_level, flags);
 
 	return 0;
 }
 
 static void
-i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
+i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma)
 {
 	struct drm_i915_gem_exec_object2 *entry;
 
-	if (!i915_gem_obj_bound_any(obj))
+	if (!drm_mm_node_allocated(&vma->node))
 		return;
 
-	entry = obj->exec_entry;
+	entry = vma->exec_entry;
 
 	if (entry->flags & __EXEC_OBJECT_HAS_FENCE)
-		i915_gem_object_unpin_fence(obj);
+		i915_gem_object_unpin_fence(vma->obj);
 
 	if (entry->flags & __EXEC_OBJECT_HAS_PIN)
-		i915_gem_object_unpin(obj);
+		i915_gem_object_unpin(vma->obj);
 
 	entry->flags &= ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN);
 }
 
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
-			    struct list_head *objects,
-			    struct i915_address_space *vm,
+			    struct list_head *vmas,
 			    bool *need_relocs)
 {
 	struct drm_i915_gem_object *obj;
-	struct list_head ordered_objects;
+	struct i915_vma *vma;
+	struct list_head ordered_vmas;
 	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
 	int retry;
 
-	INIT_LIST_HEAD(&ordered_objects);
-	while (!list_empty(objects)) {
+	INIT_LIST_HEAD(&ordered_vmas);
+	while (!list_empty(vmas)) {
 		struct drm_i915_gem_exec_object2 *entry;
 		bool need_fence, need_mappable;
 
-		obj = list_first_entry(objects,
-				       struct drm_i915_gem_object,
-				       exec_list);
-		entry = obj->exec_entry;
+		vma = list_first_entry(vmas, struct i915_vma, exec_list);
+		obj = vma->obj;
+		entry = vma->exec_entry;
 
 		need_fence =
 			has_fenced_gpu_access &&
 			entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 			obj->tiling_mode != I915_TILING_NONE;
-		need_mappable = need_fence || need_reloc_mappable(obj);
+		need_mappable = need_fence || need_reloc_mappable(vma);
 
 		if (need_mappable)
-			list_move(&obj->exec_list, &ordered_objects);
+			list_move(&vma->exec_list, &ordered_vmas);
 		else
-			list_move_tail(&obj->exec_list, &ordered_objects);
+			list_move_tail(&vma->exec_list, &ordered_vmas);
 
 		obj->base.pending_read_domains = I915_GEM_GPU_DOMAINS & ~I915_GEM_DOMAIN_COMMAND;
 		obj->base.pending_write_domain = 0;
 		obj->pending_fenced_gpu_access = false;
 	}
-	list_splice(&ordered_objects, objects);
+	list_splice(&ordered_vmas, vmas);
 
 	/* Attempt to pin all of the buffers into the GTT.
 	 * This is done in 3 phases:
@@ -535,47 +530,47 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 		int ret = 0;
 
 		/* Unbind any ill-fitting objects or pin. */
-		list_for_each_entry(obj, objects, exec_list) {
-			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+		list_for_each_entry(vma, vmas, exec_list) {
+			struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
 			bool need_fence, need_mappable;
-			u32 obj_offset;
 
-			if (!i915_gem_obj_bound(obj, vm))
+			obj = vma->obj;
+
+			if (!drm_mm_node_allocated(&vma->node))
 				continue;
 
-			obj_offset = i915_gem_obj_offset(obj, vm);
 			need_fence =
 				has_fenced_gpu_access &&
 				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 				obj->tiling_mode != I915_TILING_NONE;
-			need_mappable = need_fence || need_reloc_mappable(obj);
+			need_mappable = need_fence || need_reloc_mappable(vma);
 
 			BUG_ON((need_mappable || need_fence) &&
-			       !i915_is_ggtt(vm));
+			       !i915_is_ggtt(vma->vm));
 
 			if ((entry->alignment &&
-			     obj_offset & (entry->alignment - 1)) ||
+			     vma->node.start & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
-				ret = i915_gem_object_unbind(obj, vm);
+				ret = i915_gem_object_unbind(obj, vma->vm);
 			else
-				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
+				ret = i915_gem_execbuffer_reserve_vma(vma, ring, need_relocs);
 			if (ret)
 				goto err;
 		}
 
 		/* Bind fresh objects */
-		list_for_each_entry(obj, objects, exec_list) {
-			if (i915_gem_obj_bound(obj, vm))
+		list_for_each_entry(vma, vmas, exec_list) {
+			if (drm_mm_node_allocated(&vma->node))
 				continue;
 
-			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
+			ret = i915_gem_execbuffer_reserve_vma(vma, ring, need_relocs);
 			if (ret)
 				goto err;
 		}
 
 err:		/* Decrement pin count for bound objects */
-		list_for_each_entry(obj, objects, exec_list)
-			i915_gem_execbuffer_unreserve_object(obj);
+		list_for_each_entry(vma, vmas, exec_list)
+			i915_gem_execbuffer_unreserve_vma(vma);
 
 		if (ret != -ENOSPC || retry++)
 			return ret;
@@ -591,24 +586,27 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 				  struct drm_i915_gem_execbuffer2 *args,
 				  struct drm_file *file,
 				  struct intel_ring_buffer *ring,
-				  struct eb_objects *eb,
-				  struct drm_i915_gem_exec_object2 *exec,
-				  struct i915_address_space *vm)
+				  struct eb_vmas *eb,
+				  struct drm_i915_gem_exec_object2 *exec)
 {
 	struct drm_i915_gem_relocation_entry *reloc;
-	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	bool need_relocs;
 	int *reloc_offset;
 	int i, total, ret;
 	int count = args->buffer_count;
 
+	if (WARN_ON(list_empty(&eb->vmas)))
+		return 0;
+
+	vm = list_first_entry(&eb->vmas, struct i915_vma, exec_list)->vm;
+
 	/* We may process another execbuffer during the unlock... */
-	while (!list_empty(&eb->objects)) {
-		obj = list_first_entry(&eb->objects,
-				       struct drm_i915_gem_object,
-				       exec_list);
-		list_del_init(&obj->exec_list);
-		drm_gem_object_unreference(&obj->base);
+	while (!list_empty(&eb->vmas)) {
+		vma = list_first_entry(&eb->vmas, struct i915_vma, exec_list);
+		list_del_init(&vma->exec_list);
+		drm_gem_object_unreference(&vma->obj->base);
 	}
 
 	mutex_unlock(&dev->struct_mutex);
@@ -672,20 +670,19 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 
 	/* reacquire the objects */
 	eb_reset(eb);
-	ret = eb_lookup_objects(eb, exec, args, vm, file);
+	ret = eb_lookup_vmas(eb, exec, args, vm, file);
 	if (ret)
 		goto err;
 
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->vmas, &need_relocs);
 	if (ret)
 		goto err;
 
-	list_for_each_entry(obj, &eb->objects, exec_list) {
-		int offset = obj->exec_entry - exec;
-		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
-							       reloc + reloc_offset[offset],
-							       vm);
+	list_for_each_entry(vma, &eb->vmas, exec_list) {
+		int offset = vma->exec_entry - exec;
+		ret = i915_gem_execbuffer_relocate_vma_slow(vma, eb,
+							    reloc + reloc_offset[offset]);
 		if (ret)
 			goto err;
 	}
@@ -704,21 +701,21 @@ err:
 
 static int
 i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring,
-				struct list_head *objects)
+				struct list_head *vmas)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	uint32_t flush_domains = 0;
 	int ret;
 
-	list_for_each_entry(obj, objects, exec_list) {
-		ret = i915_gem_object_sync(obj, ring);
+	list_for_each_entry(vma, vmas, exec_list) {
+		ret = i915_gem_object_sync(vma->obj, ring);
 		if (ret)
 			return ret;
 
-		if (obj->base.write_domain & I915_GEM_DOMAIN_CPU)
-			i915_gem_clflush_object(obj);
+		if (vma->obj->base.write_domain & I915_GEM_DOMAIN_CPU)
+			i915_gem_clflush_object(vma->obj);
 
-		flush_domains |= obj->base.write_domain;
+		flush_domains |= vma->obj->base.write_domain;
 	}
 
 	if (flush_domains & I915_GEM_DOMAIN_CPU)
@@ -785,13 +782,13 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
 }
 
 static void
-i915_gem_execbuffer_move_to_active(struct list_head *objects,
-				   struct i915_address_space *vm,
+i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 				   struct intel_ring_buffer *ring)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 
-	list_for_each_entry(obj, objects, exec_list) {
+	list_for_each_entry(vma, vmas, exec_list) {
+		struct drm_i915_gem_object *obj = vma->obj;
 		u32 old_read = obj->base.read_domains;
 		u32 old_write = obj->base.write_domain;
 
@@ -801,7 +798,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
-		i915_gem_object_move_to_active(obj, vm, ring);
+		i915_gem_object_move_to_active(obj, vma->vm, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
@@ -859,7 +856,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct eb_objects *eb;
+	struct eb_vmas *eb;
 	struct drm_i915_gem_object *batch_obj;
 	struct drm_clip_rect *cliprects = NULL;
 	struct intel_ring_buffer *ring;
@@ -999,7 +996,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		goto pre_mutex_err;
 	}
 
-	eb = eb_create(args);
+	eb = eb_create(args, vm);
 	if (eb == NULL) {
 		mutex_unlock(&dev->struct_mutex);
 		ret = -ENOMEM;
@@ -1007,18 +1004,16 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	}
 
 	/* Look up object handles */
-	ret = eb_lookup_objects(eb, exec, args, vm, file);
+	ret = eb_lookup_vmas(eb, exec, args, vm, file);
 	if (ret)
 		goto err;
 
 	/* take note of the batch buffer before we might reorder the lists */
-	batch_obj = list_entry(eb->objects.prev,
-			       struct drm_i915_gem_object,
-			       exec_list);
+	batch_obj = list_entry(eb->vmas.prev, struct i915_vma, exec_list)->obj;
 
 	/* Move the objects en-masse into the GTT, evicting if necessary. */
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->vmas, &need_relocs);
 	if (ret)
 		goto err;
 
@@ -1028,7 +1023,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (ret) {
 		if (ret == -EFAULT) {
 			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
-								eb, exec, vm);
+								eb, exec);
 			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 		}
 		if (ret)
@@ -1053,7 +1048,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		vm->map_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
 	}
 
-	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->objects);
+	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
 	if (ret)
 		goto err;
 
@@ -1108,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
-	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
+	i915_gem_execbuffer_move_to_active(&eb->vmas, ring);
 	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
 
 err:
-	eb_destroy(eb, vm);
+	eb_destroy(eb);
 
 	mutex_unlock(&dev->struct_mutex);
 
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 11/12] drm/i915: Convert object coloring to VMA
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (9 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 10/12] drm/i915: Convert execbuf code to use vmas Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-23 17:07   ` Daniel Vetter
  2013-07-22  2:08 ` [PATCH 12/12] drm/i915: Convert active API " Ben Widawsky
  2013-07-22 10:42 ` [PATCH 00/12] Completion of i915 VMAs Chris Wilson
  12 siblings, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h |  3 ---
 drivers/gpu/drm/i915/i915_gem.c | 18 +-----------------
 2 files changed, 1 insertion(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fe41a3d..2b4f30c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1864,9 +1864,6 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
 			struct i915_address_space *vm);
 unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
 				struct i915_address_space *vm);
-void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
-			    struct i915_address_space *vm,
-			    enum i915_cache_level color);
 struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm);
 struct i915_vma *
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 397a4b4..e038709 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3394,7 +3394,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		}
 
 		vm->map_vma(vma, cache_level, 0);
-		i915_gem_obj_set_color(obj, vm, cache_level);
+		vma->node.color = cache_level;
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -4800,22 +4800,6 @@ unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
 	return 0;
 }
 
-void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
-			    struct i915_address_space *vm,
-			    enum i915_cache_level color)
-{
-	struct i915_vma *vma;
-	BUG_ON(list_empty(&o->vma_list));
-	list_for_each_entry(vma, &o->vma_list, vma_link) {
-		if (vma->vm == vm) {
-			vma->node.color = color;
-			return;
-		}
-	}
-
-	WARN(1, "Couldn't set color for VM %p\n", vm);
-}
-
 struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm)
 {
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 12/12] drm/i915: Convert active API to VMA
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (10 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 11/12] drm/i915: Convert object coloring to VMA Ben Widawsky
@ 2013-07-22  2:08 ` Ben Widawsky
  2013-07-22 10:42 ` [PATCH 00/12] Completion of i915 VMAs Chris Wilson
  12 siblings, 0 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22  2:08 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  5 ++---
 drivers/gpu/drm/i915/i915_gem.c            | 14 ++++++--------
 drivers/gpu/drm/i915/i915_gem_context.c    |  5 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 4 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2b4f30c..8850730 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1738,9 +1738,8 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
 bool i915_gem_object_is_active(struct drm_i915_gem_object *obj);
-void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
-				    struct i915_address_space *vm,
-				    struct intel_ring_buffer *ring);
+void i915_gem_vma_move_to_active(struct i915_vma *vma,
+				 struct intel_ring_buffer *ring);
 
 int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e038709..6ff9040 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1879,14 +1879,13 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 }
 
 void
-i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
-			       struct i915_address_space *vm,
-			       struct intel_ring_buffer *ring)
+i915_gem_vma_move_to_active(struct i915_vma *vma,
+			    struct intel_ring_buffer *ring)
 {
-	struct drm_device *dev = obj->base.dev;
+	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 seqno = intel_ring_get_seqno(ring);
-	struct i915_vma *vma;
+	struct drm_i915_gem_object *obj = vma->obj;
 
 	BUG_ON(ring == NULL);
 	if (obj->ring != ring && obj->last_write_seqno) {
@@ -1895,15 +1894,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 	obj->ring = ring;
 
-	/* Move from whatever list we were on to the tail of execution. */
-	vma = i915_gem_obj_to_vma(obj, vm);
 	/* Add a reference if we're newly entering the active list. */
 	if (!vma->active) {
 		drm_gem_object_reference(&obj->base);
 		vma->active = 1;
 	}
 
-	list_move_tail(&vma->mm_list, &vm->active_list);
+	/* Move from whatever list we were on to the tail of execution. */
+	list_move_tail(&vma->mm_list, &vma->vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
 	obj->last_read_seqno = seqno;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index cc7c0b4..8177a2d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -441,9 +441,10 @@ static int do_switch(struct i915_hw_context *to)
 	 * MI_SET_CONTEXT instead of when the next seqno has completed.
 	 */
 	if (from != NULL) {
+		struct i915_vma *vma =
+			i915_gem_obj_to_vma(from->obj, &dev_priv->gtt.base);
 		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
-					       ring);
+		i915_gem_vma_move_to_active(vma, ring);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
 		 * object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 75325c9..8559947 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -798,7 +798,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
-		i915_gem_object_move_to_active(obj, vma->vm, ring);
+		i915_gem_vma_move_to_active(vma, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
-- 
1.8.3.3

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 00/12] Completion of i915 VMAs
  2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
                   ` (11 preceding siblings ...)
  2013-07-22  2:08 ` [PATCH 12/12] drm/i915: Convert active API " Ben Widawsky
@ 2013-07-22 10:42 ` Chris Wilson
  2013-07-22 16:35   ` Ben Widawsky
  12 siblings, 1 reply; 48+ messages in thread
From: Chris Wilson @ 2013-07-22 10:42 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sun, Jul 21, 2013 at 07:08:07PM -0700, Ben Widawsky wrote:
> Map, and unmap are logical functionalities to add for an address space.
> They do more or less what you'd think: take an object and create a
> mapping via the GPU's page tables to that object. Of course, without the
> rest of the patches from [3], there will only ever be 1 address space,
> with the weird aliasing ppgtt behind it. One thing which I toyed with,
> but opted not to include was to directly pass obj,vm to map/unmap
> instead of doing the slightly less pretty way as I've done in execbuf
> and bind. In the future I think I may just do this, but for now it's not
> a big win as the end result wasn't much better (and I didn't get it to
> immediately work).

That's annoying. Currently we use map to refer to the process of making
a CPU mapping to the objects and bind for doing it from the GPU's
perspective. And since the CPU map may well require a GPU map, keeping
the nomenclature distinct helps easily confused me.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 09/12] drm/i915: create vmas at execbuf
  2013-07-22  2:08 ` [PATCH 09/12] drm/i915: create vmas at execbuf Ben Widawsky
@ 2013-07-22 13:32   ` Chris Wilson
  0 siblings, 0 replies; 48+ messages in thread
From: Chris Wilson @ 2013-07-22 13:32 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sun, Jul 21, 2013 at 07:08:16PM -0700, Ben Widawsky wrote:
> @@ -4054,7 +4051,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  				     struct i915_address_space *vm)
>  {
> -	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
> +	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_ATOMIC);

Hmm, possibly best to erradicate the allocations from underneath the
spinlock as the number may be rather large.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 00/12] Completion of i915 VMAs
  2013-07-22 10:42 ` [PATCH 00/12] Completion of i915 VMAs Chris Wilson
@ 2013-07-22 16:35   ` Ben Widawsky
  0 siblings, 0 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-22 16:35 UTC (permalink / raw)
  To: Chris Wilson, Intel GFX

On Mon, Jul 22, 2013 at 11:42:40AM +0100, Chris Wilson wrote:
> On Sun, Jul 21, 2013 at 07:08:07PM -0700, Ben Widawsky wrote:
> > Map, and unmap are logical functionalities to add for an address space.
> > They do more or less what you'd think: take an object and create a
> > mapping via the GPU's page tables to that object. Of course, without the
> > rest of the patches from [3], there will only ever be 1 address space,
> > with the weird aliasing ppgtt behind it. One thing which I toyed with,
> > but opted not to include was to directly pass obj,vm to map/unmap
> > instead of doing the slightly less pretty way as I've done in execbuf
> > and bind. In the future I think I may just do this, but for now it's not
> > a big win as the end result wasn't much better (and I didn't get it to
> > immediately work).
> 
> That's annoying. Currently we use map to refer to the process of making
> a CPU mapping to the objects and bind for doing it from the GPU's
> perspective. And since the CPU map may well require a GPU map, keeping
> the nomenclature distinct helps easily confused me.
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

I can rename it, I expect you will review it.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/12] drm/i915: plumb VM into object operations
  2013-07-22  2:08 ` [PATCH 01/12] drm/i915: plumb VM into object operations Ben Widawsky
@ 2013-07-23 16:37   ` Daniel Vetter
  2013-07-26  9:51   ` Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations) Daniel Vetter
  1 sibling, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-07-23 16:37 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sun, Jul 21, 2013 at 07:08:08PM -0700, Ben Widawsky wrote:
> This patch was formerly known as:
> "drm/i915: Create VMAs (part 3) - plumbing"
> 
> This patch adds a VM argument, bind/unbind, and the object
> offset/size/color getters/setters. It preserves the old ggtt helper
> functions because things still need, and will continue to need them.
> 
> Some code will still need to be ported over after this.
> 
> v2: Fix purge to pick an object and unbind all vmas
> This was doable because of the global bound list change.
> 
> v3: With the commit to actually pin/unpin pages in place, there is no
> longer a need to check if unbind succeeded before calling put_pages().
> Make put_pages only BUG() after checking pin count.
> 
> v4: Rebased on top of the new hangcheck work by Mika
> plumbed eb_destroy also
> Many checkpatch related fixes
> 
> v5: Very large rebase
> 
> v6:
> Change BUG_ON to WARN_ON (Daniel)
> Rename vm to ggtt in preallocate stolen, since it is always ggtt when
> dealing with stolen memory. (Daniel)
> list_for_each will short-circuit already (Daniel)
> remove superflous space (Daniel)
> Use per object list of vmas (Daniel)
> Make obj_bound_any() use obj_bound for each vm (Ben)
> s/bind_to_gtt/bind_to_vm/ (Ben)
> 
> Fixed up the inactive shrinker. As Daniel noticed the code could
> potentially count the same object multiple times. While it's not
> possible in the current case, since 1 object can only ever be bound into
> 1 address space thus far - we may as well try to get something more
> future proof in place now. With a prep patch before this to switch over
> to using the bound list + inactive check, we're now able to carry that
> forward for every address space an object is bound into.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Ok, I think this patch is too big and needs to be split up. Atm there's
way too many changes in here to be able to do a real review. Things I've
noticed while reading through it
- The set_color interface looks really strange. We loop over all vma, but
  then pass in the (obj, vm) pair so that we _again_ loop over all vmas to
  figure out the right one again to finally set the color.
- The function renaming should imo be split out as much as possible.
- There's some variable renaming like s/alignment/align/. Imo just drop
  that part.
- Some localized prep work without changing function interface should also
  go in separate patches imo, like using ggtt_vm pointers more.

Overall I still think that the little attribute helpers should accept a
vma parameter, not an (obj, vm) pair.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  29 ++-
>  drivers/gpu/drm/i915/i915_dma.c            |   4 -
>  drivers/gpu/drm/i915/i915_drv.h            | 107 +++++----
>  drivers/gpu/drm/i915/i915_gem.c            | 337 +++++++++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
>  drivers/gpu/drm/i915/i915_gem_stolen.c     |  10 +-
>  drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
>  drivers/gpu/drm/i915/i915_trace.h          |  20 +-
>  drivers/gpu/drm/i915/intel_fb.c            |   1 -
>  drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
>  drivers/gpu/drm/i915/intel_pm.c            |   2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
>  15 files changed, 479 insertions(+), 245 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index be69807..f8e590f 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj)
>  static void
>  describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  {
> +	struct i915_vma *vma;
>  	seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s",
>  		   &obj->base,
>  		   get_pin_flag(obj),
> @@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  		seq_printf(m, " (pinned x %d)", obj->pin_count);
>  	if (obj->fence_reg != I915_FENCE_REG_NONE)
>  		seq_printf(m, " (fence: %d)", obj->fence_reg);
> -	if (i915_gem_obj_ggtt_bound(obj))
> -		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
> -			   i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +		if (!i915_is_ggtt(vma->vm))
> +			seq_puts(m, " (pp");
> +		else
> +			seq_puts(m, " (g");
> +		seq_printf(m, "gtt offset: %08lx, size: %08lx)",
> +			   i915_gem_obj_offset(obj, vma->vm),
> +			   i915_gem_obj_size(obj, vma->vm));
> +	}
>  	if (obj->stolen)
>  		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
>  	if (obj->pin_mappable || obj->fault_mappable) {
> @@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	return 0;
>  }
>  
> +/* FIXME: Support multiple VM? */
>  #define count_objects(list, member) do { \
>  	list_for_each_entry(obj, list, member) { \
>  		size += i915_gem_obj_ggtt_size(obj); \
> @@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val)
>  
>  	if (val & DROP_BOUND) {
>  		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> -					 mm_list)
> -			if (obj->pin_count == 0) {
> -				ret = i915_gem_object_unbind(obj);
> -				if (ret)
> -					goto unlock;
> -			}
> +					 mm_list) {
> +			if (obj->pin_count)
> +				continue;
> +
> +			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> +			if (ret)
> +				goto unlock;
> +		}
>  	}
>  
>  	if (val & DROP_UNBOUND) {
>  		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
>  					 global_list)
>  			if (obj->pages_pin_count == 0) {
> +				/* FIXME: Do this for all vms? */
>  				ret = i915_gem_object_put_pages(obj);
>  				if (ret)
>  					goto unlock;
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 1449d06..4650519 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  
>  	i915_dump_device_info(dev_priv);
>  
> -	INIT_LIST_HEAD(&dev_priv->vm_list);
> -	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
> -	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
> -
>  	if (i915_get_bridge_dev(dev)) {
>  		ret = -EIO;
>  		goto free_priv;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8b3167e..681cb41 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1379,52 +1379,6 @@ struct drm_i915_gem_object {
>  
>  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
>  
> -/* This is a temporary define to help transition us to real VMAs. If you see
> - * this, you're either reviewing code, or bisecting it. */
> -static inline struct i915_vma *
> -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
> -{
> -	if (list_empty(&obj->vma_list))
> -		return NULL;
> -	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
> -}
> -
> -/* Whether or not this object is currently mapped by the translation tables */
> -static inline bool
> -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
> -{
> -	struct i915_vma *vma = __i915_gem_obj_to_vma(o);
> -	if (vma == NULL)
> -		return false;
> -	return drm_mm_node_allocated(&vma->node);
> -}
> -
> -/* Offset of the first PTE pointing to this object */
> -static inline unsigned long
> -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
> -{
> -	BUG_ON(list_empty(&o->vma_list));
> -	return __i915_gem_obj_to_vma(o)->node.start;
> -}
> -
> -/* The size used in the translation tables may be larger than the actual size of
> - * the object on GEN2/GEN3 because of the way tiling is handled. See
> - * i915_gem_get_gtt_size() for more details.
> - */
> -static inline unsigned long
> -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
> -{
> -	BUG_ON(list_empty(&o->vma_list));
> -	return __i915_gem_obj_to_vma(o)->node.size;
> -}
> -
> -static inline void
> -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
> -			    enum i915_cache_level color)
> -{
> -	__i915_gem_obj_to_vma(o)->node.color = color;
> -}
> -
>  /**
>   * Request queue structure.
>   *
> @@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  void i915_gem_vma_destroy(struct i915_vma *vma);
>  
>  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
> +				     struct i915_address_space *vm,
>  				     uint32_t alignment,
>  				     bool map_and_fenceable,
>  				     bool nonblocking);
>  void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
> -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
> +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> +					struct i915_address_space *vm);
>  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
>  void i915_gem_lastclose(struct drm_device *dev);
> @@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  			 struct intel_ring_buffer *to);
>  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> +				    struct i915_address_space *vm,
>  				    struct intel_ring_buffer *ring);
>  
>  int i915_gem_dumb_create(struct drm_file *file_priv,
> @@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
>  			    int tiling_mode, bool fenced);
>  
>  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> +				    struct i915_address_space *vm,
>  				    enum i915_cache_level cache_level);
>  
>  struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> @@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
>  
>  void i915_gem_restore_fences(struct drm_device *dev);
>  
> +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> +				  struct i915_address_space *vm);
> +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
> +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> +			struct i915_address_space *vm);
> +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> +				struct i915_address_space *vm);
> +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> +			    struct i915_address_space *vm,
> +			    enum i915_cache_level color);
> +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> +				     struct i915_address_space *vm);
> +/* Some GGTT VM helpers */
> +#define obj_to_ggtt(obj) \
> +	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> +static inline bool i915_is_ggtt(struct i915_address_space *vm)
> +{
> +	struct i915_address_space *ggtt =
> +		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
> +	return vm == ggtt;
> +}
> +
> +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
> +{
> +	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline unsigned long
> +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
> +{
> +	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline unsigned long
> +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
> +{
> +	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline int __must_check
> +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
> +		  uint32_t alignment,
> +		  bool map_and_fenceable,
> +		  bool nonblocking)
> +{
> +	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
> +				   map_and_fenceable, nonblocking);
> +}
> +#undef obj_to_ggtt
> +
>  /* i915_gem_context.c */
>  void i915_gem_context_init(struct drm_device *dev);
>  void i915_gem_context_fini(struct drm_device *dev);
> @@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  
>  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
>  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> +/* FIXME: this is never okay with full PPGTT */
>  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
>  				enum i915_cache_level cache_level);
>  void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> @@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
>  
>  
>  /* i915_gem_evict.c */
> -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> +int __must_check i915_gem_evict_something(struct drm_device *dev,
> +					  struct i915_address_space *vm,
> +					  int min_size,
>  					  unsigned alignment,
>  					  unsigned cache_level,
>  					  bool mappable,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 2283765..0111554 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -38,10 +38,12 @@
>  
>  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
>  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> -						    unsigned alignment,
> -						    bool map_and_fenceable,
> -						    bool nonblocking);
> +static __must_check int
> +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> +			   struct i915_address_space *vm,
> +			   unsigned alignment,
> +			   bool map_and_fenceable,
> +			   bool nonblocking);
>  static int i915_gem_phys_pwrite(struct drm_device *dev,
>  				struct drm_i915_gem_object *obj,
>  				struct drm_i915_gem_pwrite *args,
> @@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  static inline bool
>  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
>  {
> -	return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> +	return i915_gem_obj_bound_any(obj) && !obj->active;
>  }
>  
>  int
> @@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
>  		 * anyway again before the next pread happens. */
>  		if (obj->cache_level == I915_CACHE_NONE)
>  			needs_clflush = 1;
> -		if (i915_gem_obj_ggtt_bound(obj)) {
> +		if (i915_gem_obj_bound_any(obj)) {
>  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
>  			if (ret)
>  				return ret;
> @@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
>  	char __user *user_data;
>  	int page_offset, page_length, ret;
>  
> -	ret = i915_gem_object_pin(obj, 0, true, true);
> +	ret = i915_gem_ggtt_pin(obj, 0, true, true);
>  	if (ret)
>  		goto out;
>  
> @@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>  		 * right away and we therefore have to clflush anyway. */
>  		if (obj->cache_level == I915_CACHE_NONE)
>  			needs_clflush_after = 1;
> -		if (i915_gem_obj_ggtt_bound(obj)) {
> +		if (i915_gem_obj_bound_any(obj)) {
>  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
>  			if (ret)
>  				return ret;
> @@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>  	}
>  
>  	/* Now bind it into the GTT if needed */
> -	ret = i915_gem_object_pin(obj, 0, true, false);
> +	ret = i915_gem_ggtt_pin(obj,  0, true, false);
>  	if (ret)
>  		goto unlock;
>  
> @@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
>  	if (obj->pages == NULL)
>  		return 0;
>  
> -	BUG_ON(i915_gem_obj_ggtt_bound(obj));
> -
>  	if (obj->pages_pin_count)
>  		return -EBUSY;
>  
> +	BUG_ON(i915_gem_obj_bound_any(obj));
> +
>  	/* ->put_pages might need to allocate memory for the bit17 swizzle
>  	 * array, hence protect them from being reaped by removing them from gtt
>  	 * lists early. */
> @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  		  bool purgeable_only)
>  {
>  	struct drm_i915_gem_object *obj, *next;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	long count = 0;
>  
>  	list_for_each_entry_safe(obj, next,
> @@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  		}
>  	}
>  
> -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> -		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> -		    i915_gem_object_unbind(obj) == 0 &&
> -		    i915_gem_object_put_pages(obj) == 0) {
> +	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> +				 global_list) {
> +		struct i915_vma *vma, *v;
> +
> +		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
> +			continue;
> +
> +		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
> +			if (i915_gem_object_unbind(obj, vma->vm))
> +				break;
> +
> +		if (!i915_gem_object_put_pages(obj))
>  			count += obj->base.size >> PAGE_SHIFT;
> -			if (count >= target)
> -				return count;
> -		}
> +
> +		if (count >= target)
> +			return count;
>  	}
>  
>  	return count;
> @@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>  
>  void
>  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> +			       struct i915_address_space *vm,
>  			       struct intel_ring_buffer *ring)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	u32 seqno = intel_ring_get_seqno(ring);
>  
>  	BUG_ON(ring == NULL);
> @@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  }
>  
>  static void
> -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> +				 struct i915_address_space *vm)
>  {
> -	struct drm_device *dev = obj->base.dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> -
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>  	BUG_ON(!obj->active);
>  
> @@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
>  	spin_unlock(&file_priv->mm.lock);
>  }
>  
> -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
> +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
> +				    struct i915_address_space *vm)
>  {
> -	if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
> -	    acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
> +	if (acthd >= i915_gem_obj_offset(obj, vm) &&
> +	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
>  		return true;
>  
>  	return false;
> @@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
>  	return false;
>  }
>  
> +static struct i915_address_space *
> +request_to_vm(struct drm_i915_gem_request *request)
> +{
> +	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
> +	struct i915_address_space *vm;
> +
> +	vm = &dev_priv->gtt.base;
> +
> +	return vm;
> +}
> +
>  static bool i915_request_guilty(struct drm_i915_gem_request *request,
>  				const u32 acthd, bool *inside)
>  {
> @@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
>  	 * pointing inside the ring, matches the batch_obj address range.
>  	 * However this is extremely unlikely.
>  	 */
> -
>  	if (request->batch_obj) {
> -		if (i915_head_inside_object(acthd, request->batch_obj)) {
> +		if (i915_head_inside_object(acthd, request->batch_obj,
> +					    request_to_vm(request))) {
>  			*inside = true;
>  			return true;
>  		}
> @@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
>  {
>  	struct i915_ctx_hang_stats *hs = NULL;
>  	bool inside, guilty;
> +	unsigned long offset = 0;
>  
>  	/* Innocent until proven guilty */
>  	guilty = false;
>  
> +	if (request->batch_obj)
> +		offset = i915_gem_obj_offset(request->batch_obj,
> +					     request_to_vm(request));
> +
>  	if (ring->hangcheck.action != wait &&
>  	    i915_request_guilty(request, acthd, &inside)) {
>  		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
>  			  ring->name,
>  			  inside ? "inside" : "flushing",
> -			  request->batch_obj ?
> -			  i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
> +			  offset,
>  			  request->ctx ? request->ctx->id : 0,
>  			  acthd);
>  
> @@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
>  	}
>  
>  	while (!list_empty(&ring->active_list)) {
> +		struct i915_address_space *vm;
>  		struct drm_i915_gem_object *obj;
>  
>  		obj = list_first_entry(&ring->active_list,
>  				       struct drm_i915_gem_object,
>  				       ring_list);
>  
> -		i915_gem_object_move_to_inactive(obj);
> +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +			i915_gem_object_move_to_inactive(obj, vm);
>  	}
>  }
>  
> @@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
>  	struct drm_i915_gem_object *obj;
>  	struct intel_ring_buffer *ring;
>  	int i;
> @@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev)
>  	/* Move everything out of the GPU domains to ensure we do any
>  	 * necessary invalidation upon reuse.
>  	 */
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> +			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
>  
>  	i915_gem_restore_fences(dev);
>  }
> @@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
>  	 */
>  	while (!list_empty(&ring->active_list)) {
> +		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +		struct i915_address_space *vm;
>  		struct drm_i915_gem_object *obj;
>  
>  		obj = list_first_entry(&ring->active_list,
> @@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
>  			break;
>  
> -		i915_gem_object_move_to_inactive(obj);
> +		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +			i915_gem_object_move_to_inactive(obj, vm);
>  	}
>  
>  	if (unlikely(ring->trace_irq_seqno &&
> @@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
>   * Unbinds an object from the GTT aperture.
>   */
>  int
> -i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> +i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> +		       struct i915_address_space *vm)
>  {
>  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
>  	struct i915_vma *vma;
>  	int ret;
>  
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (!i915_gem_obj_bound(obj, vm))
>  		return 0;
>  
>  	if (obj->pin_count)
> @@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	if (ret)
>  		return ret;
>  
> -	trace_i915_gem_object_unbind(obj);
> +	trace_i915_gem_object_unbind(obj, vm);
>  
>  	if (obj->has_global_gtt_mapping)
>  		i915_gem_gtt_unbind_object(obj);
> @@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	/* Avoid an unnecessary call to unbind on rebind. */
>  	obj->map_and_fenceable = true;
>  
> -	vma = __i915_gem_obj_to_vma(obj);
> +	vma = i915_gem_obj_to_vma(obj, vm);
>  	list_del(&vma->vma_link);
>  	drm_mm_remove_node(&vma->node);
>  	i915_gem_vma_destroy(vma);
> @@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
>  		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
>  		     i915_gem_obj_ggtt_offset(obj), size);
>  
> +
>  		pitch_val = obj->stride / 128;
>  		pitch_val = ffs(pitch_val) - 1;
>  
> @@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
>   * Finds free space in the GTT aperture and binds the object there.
>   */
>  static int
> -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> -			    unsigned alignment,
> -			    bool map_and_fenceable,
> -			    bool nonblocking)
> +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> +			   struct i915_address_space *vm,
> +			   unsigned alignment,
> +			   bool map_and_fenceable,
> +			   bool nonblocking)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	u32 size, fence_size, fence_alignment, unfenced_alignment;
>  	bool mappable, fenceable;
> -	size_t gtt_max = map_and_fenceable ?
> -		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> +	size_t gtt_max =
> +		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
>  	struct i915_vma *vma;
>  	int ret;
>  
>  	if (WARN_ON(!list_empty(&obj->vma_list)))
>  		return -EBUSY;
>  
> +	BUG_ON(!i915_is_ggtt(vm));
> +
>  	fence_size = i915_gem_get_gtt_size(dev,
>  					   obj->base.size,
>  					   obj->tiling_mode);
> @@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  
>  	i915_gem_object_pin_pages(obj);
>  
> -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> +	/* For now we only ever use 1 vma per object */
> +	WARN_ON(!list_empty(&obj->vma_list));
> +
> +	vma = i915_gem_vma_create(obj, vm);
>  	if (IS_ERR(vma)) {
>  		i915_gem_object_unpin_pages(obj);
>  		return PTR_ERR(vma);
>  	}
>  
>  search_free:
> -	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> -						  &vma->node,
> +	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
>  						  size, alignment,
>  						  obj->cache_level, 0, gtt_max);
>  	if (ret) {
> -		ret = i915_gem_evict_something(dev, size, alignment,
> +		ret = i915_gem_evict_something(dev, vm, size, alignment,
>  					       obj->cache_level,
>  					       map_and_fenceable,
>  					       nonblocking);
> @@ -3138,18 +3172,25 @@ search_free:
>  
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
>  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> -	list_add(&vma->vma_link, &obj->vma_list);
> +
> +	/* Keep GGTT vmas first to make debug easier */
> +	if (i915_is_ggtt(vm))
> +		list_add(&vma->vma_link, &obj->vma_list);
> +	else
> +		list_add_tail(&vma->vma_link, &obj->vma_list);
>  
>  	fenceable =
> +		i915_is_ggtt(vm) &&
>  		i915_gem_obj_ggtt_size(obj) == fence_size &&
>  		(i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
>  
> -	mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
> -		dev_priv->gtt.mappable_end;
> +	mappable =
> +		i915_is_ggtt(vm) &&
> +		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
>  
>  	obj->map_and_fenceable = mappable && fenceable;
>  
> -	trace_i915_gem_object_bind(obj, map_and_fenceable);
> +	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
>  	i915_gem_verify_gtt(dev);
>  	return 0;
>  
> @@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  	int ret;
>  
>  	/* Not valid to be called on unbound objects. */
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (!i915_gem_obj_bound_any(obj))
>  		return -EINVAL;
>  
>  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> @@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  }
>  
>  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> +				    struct i915_address_space *vm,
>  				    enum i915_cache_level cache_level)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> +	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
>  	int ret;
>  
>  	if (obj->cache_level == cache_level)
> @@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  	}
>  
>  	if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> -		ret = i915_gem_object_unbind(obj);
> +		ret = i915_gem_object_unbind(obj, vm);
>  		if (ret)
>  			return ret;
>  	}
>  
> -	if (i915_gem_obj_ggtt_bound(obj)) {
> +	list_for_each_entry(vma, &obj->vma_list, vma_link) {
>  		ret = i915_gem_object_finish_gpu(obj);
>  		if (ret)
>  			return ret;
> @@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
>  					       obj, cache_level);
>  
> -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> +		i915_gem_obj_set_color(obj, vma->vm, cache_level);
>  	}
>  
>  	if (cache_level == I915_CACHE_NONE) {
> @@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file)
>  {
>  	struct drm_i915_gem_caching *args = data;
> +	struct drm_i915_private *dev_priv;
>  	struct drm_i915_gem_object *obj;
>  	enum i915_cache_level level;
>  	int ret;
> @@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>  		ret = -ENOENT;
>  		goto unlock;
>  	}
> +	dev_priv = obj->base.dev->dev_private;
>  
> -	ret = i915_gem_object_set_cache_level(obj, level);
> +	/* FIXME: Add interface for specific VM? */
> +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
>  
>  	drm_gem_object_unreference(&obj->base);
>  unlock:
> @@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  				     u32 alignment,
>  				     struct intel_ring_buffer *pipelined)
>  {
> +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  	u32 old_read_domains, old_write_domain;
>  	int ret;
>  
> @@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  	 * of uncaching, which would allow us to flush all the LLC-cached data
>  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
>  	 */
> -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +					      I915_CACHE_NONE);
>  	if (ret)
>  		return ret;
>  
> @@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  	 * (e.g. libkms for the bootup splash), we have to ensure that we
>  	 * always use map_and_fenceable for all scanout buffers.
>  	 */
> -	ret = i915_gem_object_pin(obj, alignment, true, false);
> +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
>  	if (ret)
>  		return ret;
>  
> @@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  
>  int
>  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> +		    struct i915_address_space *vm,
>  		    uint32_t alignment,
>  		    bool map_and_fenceable,
>  		    bool nonblocking)
> @@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
>  		return -EBUSY;
>  
> -	if (i915_gem_obj_ggtt_bound(obj)) {
> -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> +	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
> +
> +	if (i915_gem_obj_bound(obj, vm)) {
> +		if ((alignment &&
> +		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
>  		    (map_and_fenceable && !obj->map_and_fenceable)) {
>  			WARN(obj->pin_count,
>  			     "bo is already pinned with incorrect alignment:"
>  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
>  			     " obj->map_and_fenceable=%d\n",
> -			     i915_gem_obj_ggtt_offset(obj), alignment,
> +			     i915_gem_obj_offset(obj, vm), alignment,
>  			     map_and_fenceable,
>  			     obj->map_and_fenceable);
> -			ret = i915_gem_object_unbind(obj);
> +			ret = i915_gem_object_unbind(obj, vm);
>  			if (ret)
>  				return ret;
>  		}
>  	}
>  
> -	if (!i915_gem_obj_ggtt_bound(obj)) {
> +	if (!i915_gem_obj_bound(obj, vm)) {
>  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  
> -		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> -						  map_and_fenceable,
> -						  nonblocking);
> +		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
> +						 map_and_fenceable,
> +						 nonblocking);
>  		if (ret)
>  			return ret;
>  
> @@ -3666,7 +3717,7 @@ void
>  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
>  {
>  	BUG_ON(obj->pin_count == 0);
> -	BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> +	BUG_ON(!i915_gem_obj_bound_any(obj));
>  
>  	if (--obj->pin_count == 0)
>  		obj->pin_mappable = false;
> @@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
>  	}
>  
>  	if (obj->user_pin_count == 0) {
> -		ret = i915_gem_object_pin(obj, args->alignment, true, false);
> +		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
>  		if (ret)
>  			goto out;
>  	}
> @@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
>  	struct drm_device *dev = obj->base.dev;
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> +	struct i915_vma *vma, *next;
>  
>  	trace_i915_gem_object_destroy(obj);
>  
> @@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  		i915_gem_detach_phys_object(dev, obj);
>  
>  	obj->pin_count = 0;
> -	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> -		bool was_interruptible;
> +	/* NB: 0 or 1 elements */
> +	WARN_ON(!list_empty(&obj->vma_list) &&
> +		!list_is_singular(&obj->vma_list));
> +	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> +		int ret = i915_gem_object_unbind(obj, vma->vm);
> +		if (WARN_ON(ret == -ERESTARTSYS)) {
> +			bool was_interruptible;
>  
> -		was_interruptible = dev_priv->mm.interruptible;
> -		dev_priv->mm.interruptible = false;
> +			was_interruptible = dev_priv->mm.interruptible;
> +			dev_priv->mm.interruptible = false;
>  
> -		WARN_ON(i915_gem_object_unbind(obj));
> +			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
>  
> -		dev_priv->mm.interruptible = was_interruptible;
> +			dev_priv->mm.interruptible = was_interruptible;
> +		}
>  	}
>  
>  	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> @@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
>  	INIT_LIST_HEAD(&ring->request_list);
>  }
>  
> +static void i915_init_vm(struct drm_i915_private *dev_priv,
> +			 struct i915_address_space *vm)
> +{
> +	vm->dev = dev_priv->dev;
> +	INIT_LIST_HEAD(&vm->active_list);
> +	INIT_LIST_HEAD(&vm->inactive_list);
> +	INIT_LIST_HEAD(&vm->global_link);
> +	list_add(&vm->global_link, &dev_priv->vm_list);
> +}
> +
>  void
>  i915_gem_load(struct drm_device *dev)
>  {
> @@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev)
>  				  SLAB_HWCACHE_ALIGN,
>  				  NULL);
>  
> -	INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> -	INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> +	INIT_LIST_HEAD(&dev_priv->vm_list);
> +	i915_init_vm(dev_priv, &dev_priv->gtt.base);
> +
>  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
>  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
>  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> @@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  			     struct drm_i915_private,
>  			     mm.inactive_shrinker);
>  	struct drm_device *dev = dev_priv->dev;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	struct drm_i915_gem_object *obj;
> -	int nr_to_scan = sc->nr_to_scan;
> +	int nr_to_scan;
>  	bool unlock = true;
>  	int cnt;
>  
> @@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  		unlock = false;
>  	}
>  
> +	nr_to_scan = sc->nr_to_scan;
>  	if (nr_to_scan) {
>  		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
>  		if (nr_to_scan > 0)
> @@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
>  		if (obj->pages_pin_count == 0)
>  			cnt += obj->base.size >> PAGE_SHIFT;
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> +
> +	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> +		if (obj->active)
> +			continue;
> +
> +		i915_gem_object_flush_gtt_write_domain(obj);
> +		i915_gem_object_flush_cpu_write_domain(obj);
> +		/* FIXME: Can't assume global gtt */
> +		i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base);
> +
>  		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
>  			cnt += obj->base.size >> PAGE_SHIFT;
> +	}
>  
>  	if (unlock)
>  		mutex_unlock(&dev->struct_mutex);
>  	return cnt;
>  }
> +
> +/* All the new VM stuff */
> +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> +				  struct i915_address_space *vm)
> +{
> +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> +	struct i915_vma *vma;
> +
> +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> +		vm = &dev_priv->gtt.base;
> +
> +	BUG_ON(list_empty(&o->vma_list));
> +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> +		if (vma->vm == vm)
> +			return vma->node.start;
> +
> +	}
> +	return -1;
> +}
> +
> +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> +			struct i915_address_space *vm)
> +{
> +	struct i915_vma *vma;
> +
> +	list_for_each_entry(vma, &o->vma_list, vma_link)
> +		if (vma->vm == vm)
> +			return true;
> +
> +	return false;
> +}
> +
> +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> +{
> +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> +	struct i915_address_space *vm;
> +
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +		if (i915_gem_obj_bound(o, vm))
> +			return true;
> +
> +	return false;
> +}
> +
> +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> +				struct i915_address_space *vm)
> +{
> +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> +	struct i915_vma *vma;
> +
> +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> +		vm = &dev_priv->gtt.base;
> +
> +	BUG_ON(list_empty(&o->vma_list));
> +
> +	list_for_each_entry(vma, &o->vma_list, vma_link)
> +		if (vma->vm == vm)
> +			return vma->node.size;
> +
> +	return 0;
> +}
> +
> +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> +			    struct i915_address_space *vm,
> +			    enum i915_cache_level color)
> +{
> +	struct i915_vma *vma;
> +	BUG_ON(list_empty(&o->vma_list));
> +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> +		if (vma->vm == vm) {
> +			vma->node.color = color;
> +			return;
> +		}
> +	}
> +
> +	WARN(1, "Couldn't set color for VM %p\n", vm);
> +}
> +
> +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> +				     struct i915_address_space *vm)
> +{
> +	struct i915_vma *vma;
> +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> +		if (vma->vm == vm)
> +			return vma;
> +
> +	return NULL;
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 2470206..873577d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
>  
>  	if (INTEL_INFO(dev)->gen >= 7) {
>  		ret = i915_gem_object_set_cache_level(ctx->obj,
> +						      &dev_priv->gtt.base,
>  						      I915_CACHE_LLC_MLC);
>  		/* Failure shouldn't ever happen this early */
>  		if (WARN_ON(ret))
> @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
>  	 * default context.
>  	 */
>  	dev_priv->ring[RCS].default_context = ctx;
> -	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> +	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
>  	if (ret) {
>  		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
>  		goto err_destroy;
> @@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring,
>  static int do_switch(struct i915_hw_context *to)
>  {
>  	struct intel_ring_buffer *ring = to->ring;
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	struct i915_hw_context *from = ring->last_context;
>  	u32 hw_flags = 0;
>  	int ret;
> @@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to)
>  	if (from == to)
>  		return 0;
>  
> -	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> +	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
>  	if (ret)
>  		return ret;
>  
> @@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to)
>  	 */
>  	if (from != NULL) {
>  		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> -		i915_gem_object_move_to_active(from->obj, ring);
> +		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> +					       ring);
>  		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
>  		 * whole damn pipeline, we don't need to explicitly mark the
>  		 * object dirty. The only exception is that the context must be
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index df61f33..32efdc0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -32,24 +32,21 @@
>  #include "i915_trace.h"
>  
>  static bool
> -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> +mark_free(struct i915_vma *vma, struct list_head *unwind)
>  {
> -	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> -
> -	if (obj->pin_count)
> +	if (vma->obj->pin_count)
>  		return false;
>  
> -	list_add(&obj->exec_list, unwind);
> +	list_add(&vma->obj->exec_list, unwind);
>  	return drm_mm_scan_add_block(&vma->node);
>  }
>  
>  int
> -i915_gem_evict_something(struct drm_device *dev, int min_size,
> -			 unsigned alignment, unsigned cache_level,
> +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> +			 int min_size, unsigned alignment, unsigned cache_level,
>  			 bool mappable, bool nonblocking)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	struct list_head eviction_list, unwind_list;
>  	struct i915_vma *vma;
>  	struct drm_i915_gem_object *obj;
> @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>  	 */
>  
>  	INIT_LIST_HEAD(&unwind_list);
> -	if (mappable)
> +	if (mappable) {
> +		BUG_ON(!i915_is_ggtt(vm));
>  		drm_mm_init_scan_with_range(&vm->mm, min_size,
>  					    alignment, cache_level, 0,
>  					    dev_priv->gtt.mappable_end);
> -	else
> +	} else
>  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
>  
>  	/* First see if there is a large enough contiguous idle region... */
>  	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> -		if (mark_free(obj, &unwind_list))
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
>  
> @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>  
>  	/* Now merge in the soon-to-be-expired objects... */
>  	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		if (mark_free(obj, &unwind_list))
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
>  
> @@ -109,7 +109,7 @@ none:
>  		obj = list_first_entry(&unwind_list,
>  				       struct drm_i915_gem_object,
>  				       exec_list);
> -		vma = __i915_gem_obj_to_vma(obj);
> +		vma = i915_gem_obj_to_vma(obj, vm);
>  		ret = drm_mm_scan_remove_block(&vma->node);
>  		BUG_ON(ret);
>  
> @@ -130,7 +130,7 @@ found:
>  		obj = list_first_entry(&unwind_list,
>  				       struct drm_i915_gem_object,
>  				       exec_list);
> -		vma = __i915_gem_obj_to_vma(obj);
> +		vma = i915_gem_obj_to_vma(obj, vm);
>  		if (drm_mm_scan_remove_block(&vma->node)) {
>  			list_move(&obj->exec_list, &eviction_list);
>  			drm_gem_object_reference(&obj->base);
> @@ -145,7 +145,7 @@ found:
>  				       struct drm_i915_gem_object,
>  				       exec_list);
>  		if (ret == 0)
> -			ret = i915_gem_object_unbind(obj);
> +			ret = i915_gem_object_unbind(obj, vm);
>  
>  		list_del_init(&obj->exec_list);
>  		drm_gem_object_unreference(&obj->base);
> @@ -158,13 +158,18 @@ int
>  i915_gem_evict_everything(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
>  	struct drm_i915_gem_object *obj, *next;
> -	bool lists_empty;
> +	bool lists_empty = true;
>  	int ret;
>  
> -	lists_empty = (list_empty(&vm->inactive_list) &&
> -		       list_empty(&vm->active_list));
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		lists_empty = (list_empty(&vm->inactive_list) &&
> +			       list_empty(&vm->active_list));
> +		if (!lists_empty)
> +			lists_empty = false;
> +	}
> +
>  	if (lists_empty)
>  		return -ENOSPC;
>  
> @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
>  	i915_gem_retire_requests(dev);
>  
>  	/* Having flushed everything, unbind() should never raise an error */
> -	list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> -		if (obj->pin_count == 0)
> -			WARN_ON(i915_gem_object_unbind(obj));
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> +			if (obj->pin_count == 0)
> +				WARN_ON(i915_gem_object_unbind(obj, vm));
> +	}
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 1734825..819d8d8 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
>  }
>  
>  static void
> -eb_destroy(struct eb_objects *eb)
> +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
>  {
>  	while (!list_empty(&eb->objects)) {
>  		struct drm_i915_gem_object *obj;
> @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  				   struct eb_objects *eb,
> -				   struct drm_i915_gem_relocation_entry *reloc)
> +				   struct drm_i915_gem_relocation_entry *reloc,
> +				   struct i915_address_space *vm)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_gem_object *target_obj;
> @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  
>  static int
>  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> -				    struct eb_objects *eb)
> +				    struct eb_objects *eb,
> +				    struct i915_address_space *vm)
>  {
>  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
>  	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
>  		do {
>  			u64 offset = r->presumed_offset;
>  
> -			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> +			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> +								 vm);
>  			if (ret)
>  				return ret;
>  
> @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
>  static int
>  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
>  					 struct eb_objects *eb,
> -					 struct drm_i915_gem_relocation_entry *relocs)
> +					 struct drm_i915_gem_relocation_entry *relocs,
> +					 struct i915_address_space *vm)
>  {
>  	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
>  	int i, ret;
>  
>  	for (i = 0; i < entry->relocation_count; i++) {
> -		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> +		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> +							 vm);
>  		if (ret)
>  			return ret;
>  	}
> @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
>  }
>  
>  static int
> -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> +			     struct i915_address_space *vm)
>  {
>  	struct drm_i915_gem_object *obj;
>  	int ret = 0;
> @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
>  	 */
>  	pagefault_disable();
>  	list_for_each_entry(obj, &eb->objects, exec_list) {
> -		ret = i915_gem_execbuffer_relocate_object(obj, eb);
> +		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
>  		if (ret)
>  			break;
>  	}
> @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  				   struct intel_ring_buffer *ring,
> +				   struct i915_address_space *vm,
>  				   bool *need_reloc)
>  {
>  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  		obj->tiling_mode != I915_TILING_NONE;
>  	need_mappable = need_fence || need_reloc_mappable(obj);
>  
> -	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> +	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> +				  false);
>  	if (ret)
>  		return ret;
>  
> @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  		obj->has_aliasing_ppgtt_mapping = 1;
>  	}
>  
> -	if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> -		entry->offset = i915_gem_obj_ggtt_offset(obj);
> +	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> +		entry->offset = i915_gem_obj_offset(obj, vm);
>  		*need_reloc = true;
>  	}
>  
> @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_i915_gem_exec_object2 *entry;
>  
> -	if (!i915_gem_obj_ggtt_bound(obj))
> +	if (!i915_gem_obj_bound_any(obj))
>  		return;
>  
>  	entry = obj->exec_entry;
> @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  			    struct list_head *objects,
> +			    struct i915_address_space *vm,
>  			    bool *need_relocs)
>  {
>  	struct drm_i915_gem_object *obj;
> @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  		list_for_each_entry(obj, objects, exec_list) {
>  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
>  			bool need_fence, need_mappable;
> +			u32 obj_offset;
>  
> -			if (!i915_gem_obj_ggtt_bound(obj))
> +			if (!i915_gem_obj_bound(obj, vm))
>  				continue;
>  
> +			obj_offset = i915_gem_obj_offset(obj, vm);
>  			need_fence =
>  				has_fenced_gpu_access &&
>  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
>  				obj->tiling_mode != I915_TILING_NONE;
>  			need_mappable = need_fence || need_reloc_mappable(obj);
>  
> +			BUG_ON((need_mappable || need_fence) &&
> +			       !i915_is_ggtt(vm));
> +
>  			if ((entry->alignment &&
> -			     i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> +			     obj_offset & (entry->alignment - 1)) ||
>  			    (need_mappable && !obj->map_and_fenceable))
> -				ret = i915_gem_object_unbind(obj);
> +				ret = i915_gem_object_unbind(obj, vm);
>  			else
> -				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> +				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
>  			if (ret)
>  				goto err;
>  		}
>  
>  		/* Bind fresh objects */
>  		list_for_each_entry(obj, objects, exec_list) {
> -			if (i915_gem_obj_ggtt_bound(obj))
> +			if (i915_gem_obj_bound(obj, vm))
>  				continue;
>  
> -			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> +			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
>  			if (ret)
>  				goto err;
>  		}
> @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>  				  struct drm_file *file,
>  				  struct intel_ring_buffer *ring,
>  				  struct eb_objects *eb,
> -				  struct drm_i915_gem_exec_object2 *exec)
> +				  struct drm_i915_gem_exec_object2 *exec,
> +				  struct i915_address_space *vm)
>  {
>  	struct drm_i915_gem_relocation_entry *reloc;
>  	struct drm_i915_gem_object *obj;
> @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>  		goto err;
>  
>  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
>  	if (ret)
>  		goto err;
>  
>  	list_for_each_entry(obj, &eb->objects, exec_list) {
>  		int offset = obj->exec_entry - exec;
>  		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> -							       reloc + reloc_offset[offset]);
> +							       reloc + reloc_offset[offset],
> +							       vm);
>  		if (ret)
>  			goto err;
>  	}
> @@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
>  
>  static void
>  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> +				   struct i915_address_space *vm,
>  				   struct intel_ring_buffer *ring)
>  {
>  	struct drm_i915_gem_object *obj;
> @@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
>  		obj->base.read_domains = obj->base.pending_read_domains;
>  		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
>  
> -		i915_gem_object_move_to_active(obj, ring);
> +		i915_gem_object_move_to_active(obj, vm, ring);
>  		if (obj->base.write_domain) {
>  			obj->dirty = 1;
>  			obj->last_write_seqno = intel_ring_get_seqno(ring);
> @@ -838,7 +855,8 @@ static int
>  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  		       struct drm_file *file,
>  		       struct drm_i915_gem_execbuffer2 *args,
> -		       struct drm_i915_gem_exec_object2 *exec)
> +		       struct drm_i915_gem_exec_object2 *exec,
> +		       struct i915_address_space *vm)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct eb_objects *eb;
> @@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  
>  	/* Move the objects en-masse into the GTT, evicting if necessary. */
>  	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> -	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> +	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
>  	if (ret)
>  		goto err;
>  
>  	/* The objects are in their final locations, apply the relocations. */
>  	if (need_relocs)
> -		ret = i915_gem_execbuffer_relocate(eb);
> +		ret = i915_gem_execbuffer_relocate(eb, vm);
>  	if (ret) {
>  		if (ret == -EFAULT) {
>  			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> -								eb, exec);
> +								eb, exec, vm);
>  			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
>  		}
>  		if (ret)
> @@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  			goto err;
>  	}
>  
> -	exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> +	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> +		args->batch_start_offset;
>  	exec_len = args->batch_len;
>  	if (cliprects) {
>  		for (i = 0; i < args->num_cliprects; i++) {
> @@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  
>  	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
>  
> -	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> +	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
>  	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
>  
>  err:
> -	eb_destroy(eb);
> +	eb_destroy(eb, vm);
>  
>  	mutex_unlock(&dev->struct_mutex);
>  
> @@ -1107,6 +1126,7 @@ int
>  i915_gem_execbuffer(struct drm_device *dev, void *data,
>  		    struct drm_file *file)
>  {
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_execbuffer *args = data;
>  	struct drm_i915_gem_execbuffer2 exec2;
>  	struct drm_i915_gem_exec_object *exec_list = NULL;
> @@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
>  	exec2.flags = I915_EXEC_RENDER;
>  	i915_execbuffer2_set_context_id(exec2, 0);
>  
> -	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> +	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> +				     &dev_priv->gtt.base);
>  	if (!ret) {
>  		/* Copy the new buffer offsets back to the user's exec list. */
>  		for (i = 0; i < args->buffer_count; i++)
> @@ -1188,6 +1209,7 @@ int
>  i915_gem_execbuffer2(struct drm_device *dev, void *data,
>  		     struct drm_file *file)
>  {
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_execbuffer2 *args = data;
>  	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
>  	int ret;
> @@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
>  		return -EFAULT;
>  	}
>  
> -	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> +	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> +				     &dev_priv->gtt.base);
>  	if (!ret) {
>  		/* Copy the new buffer offsets back to the user's exec list. */
>  		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 3b639a9..44f3464 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
>  			    ppgtt->base.total);
>  	}
>  
> +	/* i915_init_vm(dev_priv, &ppgtt->base) */
> +
>  	return ret;
>  }
>  
> @@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
>  			    struct drm_i915_gem_object *obj,
>  			    enum i915_cache_level cache_level)
>  {
> -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -				   cache_level);
> +	struct i915_address_space *vm = &ppgtt->base;
> +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> +
> +	vm->insert_entries(vm, obj->pages,
> +			   obj_offset >> PAGE_SHIFT,
> +			   cache_level);
>  }
>  
>  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  			      struct drm_i915_gem_object *obj)
>  {
> -	ppgtt->base.clear_range(&ppgtt->base,
> -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -				obj->base.size >> PAGE_SHIFT);
> +	struct i915_address_space *vm = &ppgtt->base;
> +	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> +
> +	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> +			obj->base.size >> PAGE_SHIFT);
>  }
>  
>  extern int intel_iommu_gfx_mapped;
> @@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  				       dev_priv->gtt.base.start / PAGE_SIZE,
>  				       dev_priv->gtt.base.total / PAGE_SIZE);
>  
> +	if (dev_priv->mm.aliasing_ppgtt)
> +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> +
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>  		i915_gem_clflush_object(obj);
>  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> @@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  	 * aperture.  One page should be enough to keep any prefetching inside
>  	 * of the aperture.
>  	 */
> -	drm_i915_private_t *dev_priv = dev->dev_private;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
>  	struct drm_mm_node *entry;
>  	struct drm_i915_gem_object *obj;
>  	unsigned long hole_start, hole_end;
> @@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  	BUG_ON(mappable_end > end);
>  
>  	/* Subtract the guard page ... */
> -	drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> +	drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
>  	if (!HAS_LLC(dev))
>  		dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
>  
>  	/* Mark any preallocated objects as occupied */
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> -		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
>  		int ret;
>  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
>  			      i915_gem_obj_ggtt_offset(obj), obj->base.size);
>  
>  		WARN_ON(i915_gem_obj_ggtt_bound(obj));
> -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> +		ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
>  		if (ret)
>  			DRM_DEBUG_KMS("Reservation failed\n");
>  		obj->has_global_gtt_mapping = 1;
> @@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  	dev_priv->gtt.base.total = end - start;
>  
>  	/* Clear any non-preallocated blocks */
> -	drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> -			     hole_start, hole_end) {
> +	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
>  		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
>  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
>  			      hole_start, hole_end);
> -		dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> -					       hole_start / PAGE_SIZE,
> -					       count);
> +		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
>  	}
>  
>  	/* And finally clear the reserved guard page */
> -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> -				       end / PAGE_SIZE - 1, 1);
> +	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
>  }
>  
>  static bool
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 27ffb4c..000ffbd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  					       u32 size)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *ggtt = &dev_priv->gtt.base;
>  	struct drm_i915_gem_object *obj;
>  	struct drm_mm_node *stolen;
>  	struct i915_vma *vma;
> @@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	if (gtt_offset == I915_GTT_OFFSET_NONE)
>  		return obj;
>  
> -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> +	vma = i915_gem_vma_create(obj, ggtt);
>  	if (IS_ERR(vma)) {
>  		ret = PTR_ERR(vma);
>  		goto err_out;
> @@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	 */
>  	vma->node.start = gtt_offset;
>  	vma->node.size = size;
> -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> +	if (drm_mm_initialized(&ggtt->mm)) {
> +		ret = drm_mm_reserve_node(&ggtt->mm, &vma->node);
>  		if (ret) {
>  			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
>  			i915_gem_vma_destroy(vma);
> @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	obj->has_global_gtt_mapping = 1;
>  
>  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> +	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
>  
>  	return obj;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 92a8d27..808ca2a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>  
>  		obj->map_and_fenceable =
>  			!i915_gem_obj_ggtt_bound(obj) ||
> -			(i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> +			(i915_gem_obj_ggtt_offset(obj) +
> +			 obj->base.size <= dev_priv->gtt.mappable_end &&
>  			 i915_gem_object_fence_ok(obj, args->tiling_mode));
>  
>  		/* Rebind if we need a change of alignment */
>  		if (!obj->map_and_fenceable) {
> -			u32 unfenced_alignment =
> +			struct i915_address_space *ggtt = &dev_priv->gtt.base;
> +			u32 unfenced_align =
>  				i915_gem_get_gtt_alignment(dev, obj->base.size,
>  							    args->tiling_mode,
>  							    false);
> -			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> -				ret = i915_gem_object_unbind(obj);
> +			if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> +				ret = i915_gem_object_unbind(obj, ggtt);
>  		}
>  
>  		if (ret == 0) {
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 7d283b5..3f019d3 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
>  );
>  
>  TRACE_EVENT(i915_gem_object_bind,
> -	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> -	    TP_ARGS(obj, mappable),
> +	    TP_PROTO(struct drm_i915_gem_object *obj,
> +		     struct i915_address_space *vm, bool mappable),
> +	    TP_ARGS(obj, vm, mappable),
>  
>  	    TP_STRUCT__entry(
>  			     __field(struct drm_i915_gem_object *, obj)
> +			     __field(struct i915_address_space *, vm)
>  			     __field(u32, offset)
>  			     __field(u32, size)
>  			     __field(bool, mappable)
> @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
>  
>  	    TP_fast_assign(
>  			   __entry->obj = obj;
> -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> +			   __entry->size = i915_gem_obj_size(obj, vm);
>  			   __entry->mappable = mappable;
>  			   ),
>  
> @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
>  );
>  
>  TRACE_EVENT(i915_gem_object_unbind,
> -	    TP_PROTO(struct drm_i915_gem_object *obj),
> -	    TP_ARGS(obj),
> +	    TP_PROTO(struct drm_i915_gem_object *obj,
> +		     struct i915_address_space *vm),
> +	    TP_ARGS(obj, vm),
>  
>  	    TP_STRUCT__entry(
>  			     __field(struct drm_i915_gem_object *, obj)
> +			     __field(struct i915_address_space *, vm)
>  			     __field(u32, offset)
>  			     __field(u32, size)
>  			     ),
>  
>  	    TP_fast_assign(
>  			   __entry->obj = obj;
> -			   __entry->offset = i915_gem_obj_ggtt_offset(obj);
> -			   __entry->size = i915_gem_obj_ggtt_size(obj);
> +			   __entry->offset = i915_gem_obj_offset(obj, vm);
> +			   __entry->size = i915_gem_obj_size(obj, vm);
>  			   ),
>  
>  	    TP_printk("obj=%p, offset=%08x size=%x",
> diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> index f3c97e0..b69cc63 100644
> --- a/drivers/gpu/drm/i915/intel_fb.c
> +++ b/drivers/gpu/drm/i915/intel_fb.c
> @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  		      fb->width, fb->height,
>  		      i915_gem_obj_ggtt_offset(obj), obj);
>  
> -
>  	mutex_unlock(&dev->struct_mutex);
>  	vga_switcheroo_client_fb_set(dev->pdev, info);
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 2abb53e..22ccb7e 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
>  		}
>  		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
>  	} else {
> -		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
> +		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
>  		if (ret) {
>  			DRM_ERROR("failed to pin overlay register bo\n");
>  			goto out_free_bo;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 008e0e0..0fb081c 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev)
>  		return NULL;
>  	}
>  
> -	ret = i915_gem_object_pin(ctx, 4096, true, false);
> +	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
>  	if (ret) {
>  		DRM_ERROR("failed to pin power context: %d\n", ret);
>  		goto err_unref;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8527ea0..88130a3 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -481,6 +481,7 @@ out:
>  static int
>  init_pipe_control(struct intel_ring_buffer *ring)
>  {
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	struct pipe_control *pc;
>  	struct drm_i915_gem_object *obj;
>  	int ret;
> @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
>  		goto err;
>  	}
>  
> -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +					I915_CACHE_LLC);
>  
> -	ret = i915_gem_object_pin(obj, 4096, true, false);
> +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
>  	if (ret)
>  		goto err_unref;
>  
> @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
>  static int init_status_page(struct intel_ring_buffer *ring)
>  {
>  	struct drm_device *dev = ring->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_object *obj;
>  	int ret;
>  
> @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
>  		goto err;
>  	}
>  
> -	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> +	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +					I915_CACHE_LLC);
>  
> -	ret = i915_gem_object_pin(obj, 4096, true, false);
> +	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
>  	if (ret != 0) {
>  		goto err_unref;
>  	}
> @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>  
>  	ring->obj = obj;
>  
> -	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
> +	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
>  	if (ret)
>  		goto err_unref;
>  
> @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>  			return -ENOMEM;
>  		}
>  
> -		ret = i915_gem_object_pin(obj, 0, true, false);
> +		ret = i915_gem_ggtt_pin(obj, 0, true, false);
>  		if (ret != 0) {
>  			drm_gem_object_unreference(&obj->base);
>  			DRM_ERROR("Failed to ping batch bo\n");
> -- 
> 1.8.3.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/12] drm/i915: Fix up map and fenceable for VMA
  2013-07-22  2:08 ` [PATCH 02/12] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
@ 2013-07-23 16:42   ` Daniel Vetter
  2013-07-23 18:14     ` Ben Widawsky
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Vetter @ 2013-07-23 16:42 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sun, Jul 21, 2013 at 07:08:09PM -0700, Ben Widawsky wrote:
> formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
> tracking"
> 
> The map_and_fenceable tracking is per object. GTT mapping, and fences
> only apply to global GTT. As such,  object operations which are not
> performed on the global GTT should not effect mappable or fenceable
> characteristics.
> 
> Functionally, this commit could very well be squashed in to the previous
> patch which updated object operations to take a VM argument.  This
> commit is split out because it's a bit tricky (or at least it was for
> me).
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

On a quick read there seems to be lots of stuff in here which belongs into
other patches, like error handling changes, debugfs changes. At least
since you claim that the mappable/fenceable stuff is tricky it seems to
not stick out that much ...

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c    | 53 ++++++++++++++++++++++------------
>  drivers/gpu/drm/i915/i915_drv.h        |  5 ++--
>  drivers/gpu/drm/i915/i915_gem.c        | 43 +++++++++++++++++----------
>  drivers/gpu/drm/i915/i915_gem_evict.c  | 14 ++++-----
>  drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c  | 37 ++++++++++++++----------
>  6 files changed, 93 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index f8e590f..0b7df6c 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -144,7 +144,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	size_t total_obj_size, total_gtt_size;
>  	int count, ret;
>  
> @@ -152,6 +152,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	if (ret)
>  		return ret;
>  
> +	/* FIXME: the user of this interface might want more than just GGTT */
>  	switch (list) {
>  	case ACTIVE_LIST:
>  		seq_puts(m, "Active:\n");
> @@ -167,12 +168,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	}
>  
>  	total_obj_size = total_gtt_size = count = 0;
> -	list_for_each_entry(obj, head, mm_list) {
> -		seq_puts(m, "   ");
> -		describe_obj(m, obj);
> -		seq_putc(m, '\n');
> -		total_obj_size += obj->base.size;
> -		total_gtt_size += i915_gem_obj_ggtt_size(obj);
> +	list_for_each_entry(vma, head, mm_list) {
> +		seq_printf(m, "   ");
> +		describe_obj(m, vma->obj);
> +		seq_printf(m, "\n");
> +		total_obj_size += vma->obj->base.size;
> +		total_gtt_size += i915_gem_obj_size(vma->obj, vma->vm);
>  		count++;
>  	}
>  	mutex_unlock(&dev->struct_mutex);
> @@ -220,7 +221,18 @@ static int per_file_stats(int id, void *ptr, void *data)
>  	return 0;
>  }
>  
> -static int i915_gem_object_info(struct seq_file *m, void *data)
> +#define count_vmas(list, member) do { \
> +	list_for_each_entry(vma, list, member) { \
> +		size += i915_gem_obj_ggtt_size(vma->obj); \
> +		++count; \
> +		if (vma->obj->map_and_fenceable) { \
> +			mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
> +			++mappable_count; \
> +		} \
> +	} \
> +} while (0)
> +
> +static int i915_gem_object_info(struct seq_file *m, void* data)
>  {
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
> @@ -230,6 +242,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
>  	struct drm_i915_gem_object *obj;
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
>  	struct drm_file *file;
> +	struct i915_vma *vma;
>  	int ret;
>  
>  	ret = mutex_lock_interruptible(&dev->struct_mutex);
> @@ -249,12 +262,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
>  		   count, mappable_count, size, mappable_size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> -	count_objects(&vm->active_list, mm_list);
> +	count_vmas(&vm->active_list, mm_list);
>  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> -	count_objects(&vm->inactive_list, mm_list);
> +	count_vmas(&vm->inactive_list, mm_list);
>  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
> @@ -1767,7 +1780,8 @@ i915_drop_caches_set(void *data, u64 val)
>  	struct drm_device *dev = data;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_object *obj, *next;
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma, *x;
>  	int ret;
>  
>  	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
> @@ -1788,14 +1802,15 @@ i915_drop_caches_set(void *data, u64 val)
>  		i915_gem_retire_requests(dev);
>  
>  	if (val & DROP_BOUND) {
> -		list_for_each_entry_safe(obj, next, &vm->inactive_list,
> -					 mm_list) {
> -			if (obj->pin_count)
> -				continue;
> -
> -			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> -			if (ret)
> -				goto unlock;
> +		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +			list_for_each_entry_safe(vma, x, &vm->inactive_list,
> +						 mm_list)
> +				if (vma->obj->pin_count == 0) {
> +					ret = i915_gem_object_unbind(vma->obj,
> +								     vm);
> +					if (ret)
> +						goto unlock;
> +				}
>  		}
>  	}
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 681cb41..b208c30 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -541,6 +541,9 @@ struct i915_vma {
>  	struct drm_i915_gem_object *obj;
>  	struct i915_address_space *vm;
>  
> +	/** This object's place on the active/inactive lists */
> +	struct list_head mm_list;
> +
>  	struct list_head vma_link; /* Link in the object's VMA list */
>  };
>  
> @@ -1258,9 +1261,7 @@ struct drm_i915_gem_object {
>  	struct drm_mm_node *stolen;
>  	struct list_head global_list;
>  
> -	/** This object's place on the active/inactive lists */
>  	struct list_head ring_list;
> -	struct list_head mm_list;
>  	/** This object's place in the batchbuffer or on the eviction list */
>  	struct list_head exec_list;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0111554..6bdf89d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1874,6 +1874,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	u32 seqno = intel_ring_get_seqno(ring);
> +	struct i915_vma *vma;
>  
>  	BUG_ON(ring == NULL);
>  	if (obj->ring != ring && obj->last_write_seqno) {
> @@ -1889,7 +1890,8 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  	}
>  
>  	/* Move from whatever list we were on to the tail of execution. */
> -	list_move_tail(&obj->mm_list, &vm->active_list);
> +	vma = i915_gem_obj_to_vma(obj, vm);
> +	list_move_tail(&vma->mm_list, &vm->active_list);
>  	list_move_tail(&obj->ring_list, &ring->active_list);
>  
>  	obj->last_read_seqno = seqno;
> @@ -1912,10 +1914,13 @@ static void
>  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
>  				 struct i915_address_space *vm)
>  {
> +	struct i915_vma *vma;
> +
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>  	BUG_ON(!obj->active);
>  
> -	list_move_tail(&obj->mm_list, &vm->inactive_list);
> +	vma = i915_gem_obj_to_vma(obj, vm);
> +	list_move_tail(&vma->mm_list, &vm->inactive_list);
>  
>  	list_del_init(&obj->ring_list);
>  	obj->ring = NULL;
> @@ -2285,9 +2290,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct i915_address_space *vm;
> -	struct drm_i915_gem_object *obj;
>  	struct intel_ring_buffer *ring;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma;
>  	int i;
>  
>  	for_each_ring(ring, dev_priv, i)
> @@ -2297,8 +2302,8 @@ void i915_gem_reset(struct drm_device *dev)
>  	 * necessary invalidation upon reuse.
>  	 */
>  	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> -		list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> +		list_for_each_entry(vma, &vm->inactive_list, mm_list)
> +			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
>  
>  	i915_gem_restore_fences(dev);
>  }
> @@ -2633,7 +2638,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
>  
>  	trace_i915_gem_object_unbind(obj, vm);
>  
> -	if (obj->has_global_gtt_mapping)
> +	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
>  		i915_gem_gtt_unbind_object(obj);
>  	if (obj->has_aliasing_ppgtt_mapping) {
>  		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> @@ -2642,11 +2647,12 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
>  	i915_gem_gtt_finish_object(obj);
>  	i915_gem_object_unpin_pages(obj);
>  
> -	list_del(&obj->mm_list);
>  	/* Avoid an unnecessary call to unbind on rebind. */
> -	obj->map_and_fenceable = true;
> +	if (i915_is_ggtt(vm))
> +		obj->map_and_fenceable = true;
>  
>  	vma = i915_gem_obj_to_vma(obj, vm);
> +	list_del(&vma->mm_list);
>  	list_del(&vma->vma_link);
>  	drm_mm_remove_node(&vma->node);
>  	i915_gem_vma_destroy(vma);
> @@ -3171,7 +3177,7 @@ search_free:
>  		goto err_out;
>  
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &vm->inactive_list);
> +	list_add_tail(&vma->mm_list, &vm->inactive_list);
>  
>  	/* Keep GGTT vmas first to make debug easier */
>  	if (i915_is_ggtt(vm))
> @@ -3188,7 +3194,9 @@ search_free:
>  		i915_is_ggtt(vm) &&
>  		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
>  
> -	obj->map_and_fenceable = mappable && fenceable;
> +	/* Map and fenceable only changes if the VM is the global GGTT */
> +	if (i915_is_ggtt(vm))
> +		obj->map_and_fenceable = mappable && fenceable;
>  
>  	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
>  	i915_gem_verify_gtt(dev);
> @@ -3332,9 +3340,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  					    old_write_domain);
>  
>  	/* And bump the LRU for this access */
> -	if (i915_gem_object_is_inactive(obj))
> -		list_move_tail(&obj->mm_list,
> -			       &dev_priv->gtt.base.inactive_list);
> +	if (i915_gem_object_is_inactive(obj)) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> +							   &dev_priv->gtt.base);
> +		if (vma)
> +			list_move_tail(&vma->mm_list,
> +				       &dev_priv->gtt.base.inactive_list);
> +
> +	}
>  
>  	return 0;
>  }
> @@ -3906,7 +3919,6 @@ unlock:
>  void i915_gem_object_init(struct drm_i915_gem_object *obj,
>  			  const struct drm_i915_gem_object_ops *ops)
>  {
> -	INIT_LIST_HEAD(&obj->mm_list);
>  	INIT_LIST_HEAD(&obj->global_list);
>  	INIT_LIST_HEAD(&obj->ring_list);
>  	INIT_LIST_HEAD(&obj->exec_list);
> @@ -4043,6 +4055,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  		return ERR_PTR(-ENOMEM);
>  
>  	INIT_LIST_HEAD(&vma->vma_link);
> +	INIT_LIST_HEAD(&vma->mm_list);
>  	vma->vm = vm;
>  	vma->obj = obj;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 32efdc0..18a44a9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -87,8 +87,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
>  		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
>  
>  	/* First see if there is a large enough contiguous idle region... */
> -	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
>  		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
> @@ -97,8 +96,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
>  		goto none;
>  
>  	/* Now merge in the soon-to-be-expired objects... */
> -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +	list_for_each_entry(vma, &vm->active_list, mm_list) {
>  		if (mark_free(vma, &unwind_list))
>  			goto found;
>  	}
> @@ -159,7 +157,7 @@ i915_gem_evict_everything(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct i915_address_space *vm;
> -	struct drm_i915_gem_object *obj, *next;
> +	struct i915_vma *vma, *next;
>  	bool lists_empty = true;
>  	int ret;
>  
> @@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
>  
>  	/* Having flushed everything, unbind() should never raise an error */
>  	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> -		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> -			if (obj->pin_count == 0)
> -				WARN_ON(i915_gem_object_unbind(obj, vm));
> +		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
> +			if (vma->obj->pin_count == 0)
> +				WARN_ON(i915_gem_object_unbind(vma->obj, vm));
>  	}
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 000ffbd..fa60103 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	obj->has_global_gtt_mapping = 1;
>  
>  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &ggtt->inactive_list);
> +	list_add_tail(&vma->mm_list, &ggtt->inactive_list);
>  
>  	return obj;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index d970d84..9623a4e 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -556,11 +556,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
>  static u32 capture_active_bo(struct drm_i915_error_buffer *err,
>  			     int count, struct list_head *head)
>  {
> -	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
>  	int i = 0;
>  
> -	list_for_each_entry(obj, head, mm_list) {
> -		capture_bo(err++, obj);
> +	list_for_each_entry(vma, head, mm_list) {
> +		capture_bo(err++, vma->obj);
>  		if (++i == count)
>  			break;
>  	}
> @@ -622,7 +622,8 @@ static struct drm_i915_error_object *
>  i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  			     struct intel_ring_buffer *ring)
>  {
> -	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_address_space *vm;
> +	struct i915_vma *vma;
>  	struct drm_i915_gem_object *obj;
>  	u32 seqno;
>  
> @@ -642,20 +643,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  	}
>  
>  	seqno = ring->get_seqno(ring, false);
> -	list_for_each_entry(obj, &vm->active_list, mm_list) {
> -		if (obj->ring != ring)
> -			continue;
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		list_for_each_entry(vma, &vm->active_list, mm_list) {
> +			obj = vma->obj;
> +			if (obj->ring != ring)
> +				continue;
>  
> -		if (i915_seqno_passed(seqno, obj->last_read_seqno))
> -			continue;
> +			if (i915_seqno_passed(seqno, obj->last_read_seqno))
> +				continue;
>  
> -		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> -			continue;
> +			if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
> +				continue;
>  
> -		/* We need to copy these to an anonymous buffer as the simplest
> -		 * method to avoid being overwritten by userspace.
> -		 */
> -		return i915_error_object_create(dev_priv, obj);
> +			/* We need to copy these to an anonymous buffer as the simplest
> +			 * method to avoid being overwritten by userspace.
> +			 */
> +			return i915_error_object_create(dev_priv, obj);
> +		}
>  	}
>  
>  	return NULL;
> @@ -775,11 +779,12 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
>  				     struct drm_i915_error_state *error)
>  {
>  	struct i915_address_space *vm = &dev_priv->gtt.base;
> +	struct i915_vma *vma;
>  	struct drm_i915_gem_object *obj;
>  	int i;
>  
>  	i = 0;
> -	list_for_each_entry(obj, &vm->active_list, mm_list)
> +	list_for_each_entry(vma, &vm->active_list, mm_list)
>  		i++;
>  	error->active_bo_count = i;
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> -- 
> 1.8.3.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/12] drm/i915: Track active by VMA instead of object
  2013-07-22  2:08 ` [PATCH 04/12] drm/i915: Track active by VMA instead of object Ben Widawsky
@ 2013-07-23 16:48   ` Daniel Vetter
  2013-07-26 21:48     ` Ben Widawsky
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Vetter @ 2013-07-23 16:48 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sun, Jul 21, 2013 at 07:08:11PM -0700, Ben Widawsky wrote:
> Even though we want to be able to track active by VMA, the rest of the
> code is still using objects for most internal APIs. To solve this,
> create an object_is_active() function to help us in converting over to
> VMA usage.
> 
> Because we intend to keep around some functions that care about objects,
> and not VMAs, having this function around will be useful even as we
> begin to use VMAs more in function arguments.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Still not really convinced. For access synchronization we don't care
through which vm a bo is still access, only how (read/write) and when was
the last access (ring + seqno).

Note that this means that the per-vm lru doesn't really need an
active/inactive split anymore, for evict_something we only care about the
ordering and not whether a bo is active or not. unbind() will care but I'm
not sure that the "same bo in multiple address spaces needs to be evicted"
use-case is something we even should care about.

So imo this commit needs a good justificatio for _why_ we want to track
active per-vma. Atm I don't see a use-case, but I see complexity.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h            | 15 +++----
>  drivers/gpu/drm/i915/i915_gem.c            | 64 ++++++++++++++++++------------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
>  3 files changed, 48 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f809204..bdce9c1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -541,6 +541,13 @@ struct i915_vma {
>  	struct drm_i915_gem_object *obj;
>  	struct i915_address_space *vm;
>  
> +	/**
> +	 * This is set if the object is on the active lists (has pending
> +	 * rendering and so a non-zero seqno), and is not set if it i s on
> +	 * inactive (ready to be unbound) list.
> +	 */
> +	unsigned int active:1;
> +
>  	/** This object's place on the active/inactive lists */
>  	struct list_head mm_list;
>  
> @@ -1266,13 +1273,6 @@ struct drm_i915_gem_object {
>  	struct list_head exec_list;
>  
>  	/**
> -	 * This is set if the object is on the active lists (has pending
> -	 * rendering and so a non-zero seqno), and is not set if it i s on
> -	 * inactive (ready to be unbound) list.
> -	 */
> -	unsigned int active:1;
> -
> -	/**
>  	 * This is set if the object has been written to since last bound
>  	 * to the GTT
>  	 */
> @@ -1726,6 +1726,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  			 struct intel_ring_buffer *to);
> +bool i915_gem_object_is_active(struct drm_i915_gem_object *obj);
>  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  				    struct i915_address_space *vm,
>  				    struct intel_ring_buffer *ring);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6bdf89d..9ea6424 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -119,10 +119,22 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  	return 0;
>  }
>  
> +/* NB: Not the same as !i915_gem_object_is_inactive */
> +bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
> +{
> +	struct i915_vma *vma;
> +
> +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> +		if (vma->active)
> +			return true;
> +
> +	return false;
> +}
> +
>  static inline bool
>  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
>  {
> -	return i915_gem_obj_bound_any(obj) && !obj->active;
> +	return i915_gem_obj_bound_any(obj) && !i915_gem_object_is_active(obj);
>  }
>  
>  int
> @@ -1883,14 +1895,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  	}
>  	obj->ring = ring;
>  
> +	/* Move from whatever list we were on to the tail of execution. */
> +	vma = i915_gem_obj_to_vma(obj, vm);
>  	/* Add a reference if we're newly entering the active list. */
> -	if (!obj->active) {
> +	if (!vma->active) {
>  		drm_gem_object_reference(&obj->base);
> -		obj->active = 1;
> +		vma->active = 1;
>  	}
>  
> -	/* Move from whatever list we were on to the tail of execution. */
> -	vma = i915_gem_obj_to_vma(obj, vm);
>  	list_move_tail(&vma->mm_list, &vm->active_list);
>  	list_move_tail(&obj->ring_list, &ring->active_list);
>  
> @@ -1911,16 +1923,23 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  }
>  
>  static void
> -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> -				 struct i915_address_space *vm)
> +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  {
> +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> +	struct i915_address_space *vm;
>  	struct i915_vma *vma;
> +	int i = 0;
>  
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> -	BUG_ON(!obj->active);
>  
> -	vma = i915_gem_obj_to_vma(obj, vm);
> -	list_move_tail(&vma->mm_list, &vm->inactive_list);
> +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +		vma = i915_gem_obj_to_vma(obj, vm);
> +		if (!vma || !vma->active)
> +			continue;
> +		list_move_tail(&vma->mm_list, &vm->inactive_list);
> +		vma->active = 0;
> +		i++;
> +	}
>  
>  	list_del_init(&obj->ring_list);
>  	obj->ring = NULL;
> @@ -1932,8 +1951,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
>  	obj->last_fenced_seqno = 0;
>  	obj->fenced_gpu_access = false;
>  
> -	obj->active = 0;
> -	drm_gem_object_unreference(&obj->base);
> +	while (i--)
> +		drm_gem_object_unreference(&obj->base);
>  
>  	WARN_ON(i915_verify_lists(dev));
>  }
> @@ -2254,15 +2273,13 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
>  	}
>  
>  	while (!list_empty(&ring->active_list)) {
> -		struct i915_address_space *vm;
>  		struct drm_i915_gem_object *obj;
>  
>  		obj = list_first_entry(&ring->active_list,
>  				       struct drm_i915_gem_object,
>  				       ring_list);
>  
> -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> -			i915_gem_object_move_to_inactive(obj, vm);
> +		i915_gem_object_move_to_inactive(obj);
>  	}
>  }
>  
> @@ -2348,8 +2365,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
>  	 */
>  	while (!list_empty(&ring->active_list)) {
> -		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> -		struct i915_address_space *vm;
>  		struct drm_i915_gem_object *obj;
>  
>  		obj = list_first_entry(&ring->active_list,
> @@ -2359,8 +2374,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
>  			break;
>  
> -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> -			i915_gem_object_move_to_inactive(obj, vm);
> +		BUG_ON(!i915_gem_object_is_active(obj));
> +		i915_gem_object_move_to_inactive(obj);
>  	}
>  
>  	if (unlikely(ring->trace_irq_seqno &&
> @@ -2435,7 +2450,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
>  {
>  	int ret;
>  
> -	if (obj->active) {
> +	if (i915_gem_object_is_active(obj)) {
>  		ret = i915_gem_check_olr(obj->ring, obj->last_read_seqno);
>  		if (ret)
>  			return ret;
> @@ -2500,7 +2515,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  	if (ret)
>  		goto out;
>  
> -	if (obj->active) {
> +	if (i915_gem_object_is_active(obj)) {
>  		seqno = obj->last_read_seqno;
>  		ring = obj->ring;
>  	}
> @@ -3850,7 +3865,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>  	 */
>  	ret = i915_gem_object_flush_active(obj);
>  
> -	args->busy = obj->active;
> +	args->busy = i915_gem_object_is_active(obj);
>  	if (obj->ring) {
>  		BUILD_BUG_ON(I915_NUM_RINGS > 16);
>  		args->busy |= intel_ring_flag(obj->ring) << 16;
> @@ -4716,13 +4731,12 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  			cnt += obj->base.size >> PAGE_SHIFT;
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> -		if (obj->active)
> +		if (i915_gem_object_is_active(obj))
>  			continue;
>  
>  		i915_gem_object_flush_gtt_write_domain(obj);
>  		i915_gem_object_flush_cpu_write_domain(obj);
> -		/* FIXME: Can't assume global gtt */
> -		i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base);
> +		i915_gem_object_move_to_inactive(obj);
>  
>  		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
>  			cnt += obj->base.size >> PAGE_SHIFT;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 819d8d8..8d2643b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -251,7 +251,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  	}
>  
>  	/* We can't wait for rendering with pagefaults disabled */
> -	if (obj->active && in_atomic())
> +	if (i915_gem_object_is_active(obj) && in_atomic())
>  		return -EFAULT;
>  
>  	reloc->delta += target_offset;
> -- 
> 1.8.3.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/12] drm/i915: Use the new vm [un]bind functions
  2013-07-22  2:08 ` [PATCH 06/12] drm/i915: Use the new vm [un]bind functions Ben Widawsky
@ 2013-07-23 16:54   ` Daniel Vetter
  2013-07-26 21:48     ` Ben Widawsky
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Vetter @ 2013-07-23 16:54 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sun, Jul 21, 2013 at 07:08:13PM -0700, Ben Widawsky wrote:
> Building on the last patch which created the new function pointers in
> the VM for bind/unbind, here we actually put those new function pointers
> to use.
> 
> Split out as a separate patch to aid in review. I'm fine with squashing
> into the previous patch if people request it.
> 
> v2: Updated to address the smart ggtt which can do aliasing as needed
> Make sure we bind to global gtt when mappable and fenceable. I thought
> we could get away without this initialy, but we cannot.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Meta review on the patch split: If you create new functions in a prep
patch, then switch and then kill the old functions it's much harder to
review whether any unwanted functional changes have been introduced.
Reviewers have to essentially keep both the old and new code open and
compare by hand.  And generally the really hard regression in gem have
been due to such deeply-hidden accidental changes, and we frankly don't
yet have the test coverage to just gloss over this.

If you instead first prepare the existing functions by changing the
arguments and logic, and then once everything is in place switch over to
vfuncs in the 2nd patch changes will be in-place. In-place changes are
much easier to review since diff compresses away unchanged parts.

Second reason for this approach is that the functions stay at the same
place in the source code file, which reduces the amount of spurious
conflicts when rebasing a large set of patches around such changes ...

I need to ponder this more.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h            | 10 ------
>  drivers/gpu/drm/i915/i915_gem.c            | 37 +++++++++------------
>  drivers/gpu/drm/i915/i915_gem_context.c    |  7 ++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 ++++++++--------
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 53 ++----------------------------
>  5 files changed, 37 insertions(+), 99 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f3f2825..8d6aa34 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1933,18 +1933,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>  
>  /* i915_gem_gtt.c */
>  void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
> -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> -			    struct drm_i915_gem_object *obj,
> -			    enum i915_cache_level cache_level);
> -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> -			      struct drm_i915_gem_object *obj);
> -
>  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
>  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> -/* FIXME: this is never okay with full PPGTT */
> -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> -				enum i915_cache_level cache_level);
> -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
>  void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
>  void i915_gem_init_global_gtt(struct drm_device *dev);
>  void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9ea6424..63297d7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2653,12 +2653,9 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
>  
>  	trace_i915_gem_object_unbind(obj, vm);
>  
> -	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
> -		i915_gem_gtt_unbind_object(obj);
> -	if (obj->has_aliasing_ppgtt_mapping) {
> -		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> -		obj->has_aliasing_ppgtt_mapping = 0;
> -	}
> +	vma = i915_gem_obj_to_vma(obj, vm);
> +	vm->unmap_vma(vma);
> +
>  	i915_gem_gtt_finish_object(obj);
>  	i915_gem_object_unpin_pages(obj);
>  
> @@ -2666,7 +2663,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
>  	if (i915_is_ggtt(vm))
>  		obj->map_and_fenceable = true;
>  
> -	vma = i915_gem_obj_to_vma(obj, vm);
>  	list_del(&vma->mm_list);
>  	list_del(&vma->vma_link);
>  	drm_mm_remove_node(&vma->node);
> @@ -3372,7 +3368,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  				    enum i915_cache_level cache_level)
>  {
>  	struct drm_device *dev = obj->base.dev;
> -	drm_i915_private_t *dev_priv = dev->dev_private;
>  	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
>  	int ret;
>  
> @@ -3407,13 +3402,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  				return ret;
>  		}
>  
> -		if (obj->has_global_gtt_mapping)
> -			i915_gem_gtt_bind_object(obj, cache_level);
> -		if (obj->has_aliasing_ppgtt_mapping)
> -			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> -					       obj, cache_level);
> -
> -		i915_gem_obj_set_color(obj, vma->vm, cache_level);
> +		vm->map_vma(vma, cache_level, 0);
> +		i915_gem_obj_set_color(obj, vm, cache_level);
>  	}
>  
>  	if (cache_level == I915_CACHE_NONE) {
> @@ -3695,6 +3685,8 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  		    bool map_and_fenceable,
>  		    bool nonblocking)
>  {
> +	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
> +	struct i915_vma *vma;
>  	int ret;
>  
>  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> @@ -3702,6 +3694,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  
>  	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
>  
> +	/* FIXME: Use vma for bounds check */
>  	if (i915_gem_obj_bound(obj, vm)) {
>  		if ((alignment &&
>  		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
> @@ -3720,20 +3713,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  	}
>  
>  	if (!i915_gem_obj_bound(obj, vm)) {
> -		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> -
>  		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
>  						 map_and_fenceable,
>  						 nonblocking);
>  		if (ret)
>  			return ret;
>  
> -		if (!dev_priv->mm.aliasing_ppgtt)
> -			i915_gem_gtt_bind_object(obj, obj->cache_level);
> -	}
> +		vma = i915_gem_obj_to_vma(obj, vm);
> +		vm->map_vma(vma, obj->cache_level, flags);
> +	} else
> +		vma = i915_gem_obj_to_vma(obj, vm);
>  
> +	/* Objects are created map and fenceable. If we bind an object
> +	 * the first time, and we had aliasing PPGTT (and didn't request
> +	 * GLOBAL), we'll need to do this on the second bind.*/
>  	if (!obj->has_global_gtt_mapping && map_and_fenceable)
> -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> +		vm->map_vma(vma, obj->cache_level, GLOBAL_BIND);
>  
>  	obj->pin_count++;
>  	obj->pin_mappable |= map_and_fenceable;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 873577d..cc7c0b4 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -417,8 +417,11 @@ static int do_switch(struct i915_hw_context *to)
>  		return ret;
>  	}
>  
> -	if (!to->obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
> +	if (!to->obj->has_global_gtt_mapping) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
> +							   &dev_priv->gtt.base);
> +		vma->vm->map_vma(vma, to->obj->cache_level, GLOBAL_BIND);
> +	}
>  
>  	if (!to->is_initialized || is_default_context(to))
>  		hw_flags |= MI_RESTORE_INHIBIT;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 8d2643b..6359ef2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -197,8 +197,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  	if (unlikely(IS_GEN6(dev) &&
>  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
>  	    !target_i915_obj->has_global_gtt_mapping)) {
> -		i915_gem_gtt_bind_object(target_i915_obj,
> -					 target_i915_obj->cache_level);
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +		vma->vm->map_vma(vma, target_i915_obj->cache_level,
> +				 GLOBAL_BIND);
>  	}
>  
>  	/* Validate that the target is in a valid r/w GPU domain */
> @@ -404,10 +405,12 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  				   struct i915_address_space *vm,
>  				   bool *need_reloc)
>  {
> -	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
>  	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
>  	bool need_fence, need_mappable;
> +	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
> +		!obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
> +	struct i915_vma *vma;
>  	int ret;
>  
>  	need_fence =
> @@ -421,6 +424,7 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  	if (ret)
>  		return ret;
>  
> +	vma = i915_gem_obj_to_vma(obj, vm);
>  	entry->flags |= __EXEC_OBJECT_HAS_PIN;
>  
>  	if (has_fenced_gpu_access) {
> @@ -436,14 +440,6 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  		}
>  	}
>  
> -	/* Ensure ppgtt mapping exists if needed */
> -	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> -		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> -				       obj, obj->cache_level);
> -
> -		obj->has_aliasing_ppgtt_mapping = 1;
> -	}
> -
>  	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
>  		entry->offset = i915_gem_obj_offset(obj, vm);
>  		*need_reloc = true;
> @@ -454,9 +450,7 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
>  	}
>  
> -	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
> -	    !obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> +	vm->map_vma(vma, obj->cache_level, flags);
>  
>  	return 0;
>  }
> @@ -1047,8 +1041,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
>  	 * hsw should have this fixed, but let's be paranoid and do it
>  	 * unconditionally for now. */
> -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +	if (flags & I915_DISPATCH_SECURE &&
> +	    !batch_obj->has_global_gtt_mapping) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
> +		vm->map_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> +	}
>  
>  	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->objects);
>  	if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 03e6179..1de49a0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -414,18 +414,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
>  	dev_priv->mm.aliasing_ppgtt = NULL;
>  }
>  
> -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> -			    struct drm_i915_gem_object *obj,
> -			    enum i915_cache_level cache_level)
> -{
> -	struct i915_address_space *vm = &ppgtt->base;
> -	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> -
> -	vm->insert_entries(vm, obj->pages,
> -			   obj_offset >> PAGE_SHIFT,
> -			   cache_level);
> -}
> -
>  static void __always_unused gen6_ppgtt_map_vma(struct i915_vma *vma,
>  					       enum i915_cache_level cache_level,
>  					       u32 flags)
> @@ -437,16 +425,6 @@ static void __always_unused gen6_ppgtt_map_vma(struct i915_vma *vma,
>  	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
>  }
>  
> -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> -			      struct drm_i915_gem_object *obj)
> -{
> -	struct i915_address_space *vm = &ppgtt->base;
> -	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> -
> -	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> -			obj->base.size >> PAGE_SHIFT);
> -}
> -
>  static void __always_unused gen6_ppgtt_unmap_vma(struct i915_vma *vma)
>  {
>  	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> @@ -507,8 +485,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> +							   &dev_priv->gtt.base);
>  		i915_gem_clflush_object(obj);
> -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> +		vma->vm->map_vma(vma, obj->cache_level, 0);
>  	}
>  
>  	i915_gem_chipset_flush(dev);
> @@ -664,33 +644,6 @@ static void gen6_ggtt_map_vma(struct i915_vma *vma,
>  	}
>  }
>  
> -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> -			      enum i915_cache_level cache_level)
> -{
> -	struct drm_device *dev = obj->base.dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> -
> -	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
> -					  entry,
> -					  cache_level);
> -
> -	obj->has_global_gtt_mapping = 1;
> -}
> -
> -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> -{
> -	struct drm_device *dev = obj->base.dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> -
> -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> -				       entry,
> -				       obj->base.size >> PAGE_SHIFT);
> -
> -	obj->has_global_gtt_mapping = 0;
> -}
> -
>  static void gen6_ggtt_unmap_vma(struct i915_vma *vma)
>  {
>  	struct drm_device *dev = vma->vm->dev;
> -- 
> 1.8.3.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/12] drm/i915: eliminate vm->insert_entries()
  2013-07-22  2:08 ` [PATCH 07/12] drm/i915: eliminate vm->insert_entries() Ben Widawsky
@ 2013-07-23 16:57   ` Daniel Vetter
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-07-23 16:57 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sun, Jul 21, 2013 at 07:08:14PM -0700, Ben Widawsky wrote:
> With bind/unbind function pointers in place, we no longer need
> insert_entries. We could, and want, to remove clear_range, however it's
> not totally easy at this point. Since it's used in a couple of place
> still that don't only deal in objects: setup, ppgtt init, and restore
> gtt mappings.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Oh, ->insert_entries has become defunct in the previous patches. I didn't
spot this ... Not great for rebasing -internal since such changes pretty
much guarantee that I botch the rebase. And I think we want to keep these
interfaces here.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ----------------
>  1 file changed, 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 1de49a0..5c04887 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -315,7 +315,6 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  	ppgtt->base.unmap_vma = NULL;
>  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
>  	ppgtt->base.map_vma = NULL;
> -	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
>  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
>  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
>  	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
> @@ -570,19 +569,6 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>  	readl(gtt_base);
>  }
>  
> -
> -static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> -				     struct sg_table *st,
> -				     unsigned int pg_start,
> -				     enum i915_cache_level cache_level)
> -{
> -	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
> -		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
> -
> -	intel_gtt_insert_sg_entries(st, pg_start, flags);
> -
> -}
> -
>  static void i915_ggtt_map_vma(struct i915_vma *vma,
>  			      enum i915_cache_level cache_level,
>  			      u32 unused)
> @@ -895,7 +881,6 @@ static int gen6_gmch_probe(struct drm_device *dev,
>  
>  	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
>  	dev_priv->gtt.base.unmap_vma = gen6_ggtt_unmap_vma;
> -	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
>  	dev_priv->gtt.base.map_vma = gen6_ggtt_map_vma;
>  
>  	return ret;
> @@ -929,7 +914,6 @@ static int i915_gmch_probe(struct drm_device *dev,
>  	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
>  	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
>  	dev_priv->gtt.base.unmap_vma = i915_ggtt_unmap_vma;
> -	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
>  	dev_priv->gtt.base.map_vma = i915_ggtt_map_vma;
>  
>  	return 0;
> -- 
> 1.8.3.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/12] drm/i915: Convert object coloring to VMA
  2013-07-22  2:08 ` [PATCH 11/12] drm/i915: Convert object coloring to VMA Ben Widawsky
@ 2013-07-23 17:07   ` Daniel Vetter
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-07-23 17:07 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sun, Jul 21, 2013 at 07:08:18PM -0700, Ben Widawsky wrote:
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Oh, here's the patch I've been looking for in patch 1 ;-)

I think if you split up patch 1 into different pieces _without_ changing
anything in the aggregate diff (see my little howto on our internal wiki)
then I guess I can be appeased to merge stuff as-is, or suggest to squash
in individual fixups like this one here.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h |  3 ---
>  drivers/gpu/drm/i915/i915_gem.c | 18 +-----------------
>  2 files changed, 1 insertion(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index fe41a3d..2b4f30c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1864,9 +1864,6 @@ bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
>  			struct i915_address_space *vm);
>  unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
>  				struct i915_address_space *vm);
> -void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> -			    struct i915_address_space *vm,
> -			    enum i915_cache_level color);
>  struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
>  				     struct i915_address_space *vm);
>  struct i915_vma *
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 397a4b4..e038709 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3394,7 +3394,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  		}
>  
>  		vm->map_vma(vma, cache_level, 0);
> -		i915_gem_obj_set_color(obj, vm, cache_level);
> +		vma->node.color = cache_level;
>  	}
>  
>  	if (cache_level == I915_CACHE_NONE) {
> @@ -4800,22 +4800,6 @@ unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
>  	return 0;
>  }
>  
> -void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> -			    struct i915_address_space *vm,
> -			    enum i915_cache_level color)
> -{
> -	struct i915_vma *vma;
> -	BUG_ON(list_empty(&o->vma_list));
> -	list_for_each_entry(vma, &o->vma_list, vma_link) {
> -		if (vma->vm == vm) {
> -			vma->node.color = color;
> -			return;
> -		}
> -	}
> -
> -	WARN(1, "Couldn't set color for VM %p\n", vm);
> -}
> -
>  struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
>  				     struct i915_address_space *vm)
>  {
> -- 
> 1.8.3.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/12] drm/i915: Fix up map and fenceable for VMA
  2013-07-23 16:42   ` Daniel Vetter
@ 2013-07-23 18:14     ` Ben Widawsky
  0 siblings, 0 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-23 18:14 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 23, 2013 at 06:42:35PM +0200, Daniel Vetter wrote:
> On Sun, Jul 21, 2013 at 07:08:09PM -0700, Ben Widawsky wrote:
> > formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
> > tracking"
> > 
> > The map_and_fenceable tracking is per object. GTT mapping, and fences
> > only apply to global GTT. As such,  object operations which are not
> > performed on the global GTT should not effect mappable or fenceable
> > characteristics.
> > 
> > Functionally, this commit could very well be squashed in to the previous
> > patch which updated object operations to take a VM argument.  This
> > commit is split out because it's a bit tricky (or at least it was for
> > me).
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> On a quick read there seems to be lots of stuff in here which belongs into
> other patches, like error handling changes, debugfs changes. At least
> since you claim that the mappable/fenceable stuff is tricky it seems to
> not stick out that much ...

Yep. I can't remember if I intentionally squashed this, and didn't
update the commit, or accidentally squashed it. In either case, you're
right.
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-22  2:08 ` [PATCH 01/12] drm/i915: plumb VM into object operations Ben Widawsky
  2013-07-23 16:37   ` Daniel Vetter
@ 2013-07-26  9:51   ` Daniel Vetter
  2013-07-26 16:59     ` Jesse Barnes
  2013-07-26 20:15     ` Ben Widawsky
  1 sibling, 2 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-07-26  9:51 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

HI all,

So Ben&I had a bit a private discussion and one thing I've explained a bit
more in detail is what kind of review I'm doing as maintainer. I've
figured this is generally useful. We've also discussed a bit that for
developers without their own lab it would be nice if QA could test random
branches on their set of machines. But imo that'll take quite a while,
there's lots of other stuff to improve in QA land first. Anyway, here's
it:

Now an explanation for why this freaked me out, which is essentially an
explanation of what I do when I do maintainer reviews:

Probably the most important question I ask myself when reading a patch is
"if a regression would bisect to this, and the bisect is the only useful
piece of evidence, would I stand a chance to understand it?".  Your patch
is big, has the appearance of doing a few unrelated things and could very
well hide a bug which would take me an awful lot of time to spot. So imo
the answer for your patch is a clear "no".

I've merged a few such patches in the past  where I've had a similar hunch
and regretted it almost always. I've also sometimes split-up the patch
while applying, but that approach doesn't scale any more with our rather
big team.

The second thing I try to figure out is whether the patch author is indeed
the local expert on the topic at hand now. With our team size and patch
flow I don't stand a chance if I try to understand everything to the last
detail. Instead I try to assess this through the proxy of convincing
myself the the patch submitter understands stuff much better than I do. I
tend to check that by asking random questions, proposing alternative
approaches and also by rating code/patch clarity. The obj_set_color
double-loop very much gave me the impression that you didn't have a clear
idea about how exactly this should work, so that  hunk trigger this
maintainer hunch.

I admit that this is all rather fluffy and very much an inexact science,
but it's the only tools I have as a maintainer. The alternative of doing
shit myself or checking everything myself in-depth just doesnt scale.

Cheers, Daniel


On Mon, Jul 22, 2013 at 4:08 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> This patch was formerly known as:
> "drm/i915: Create VMAs (part 3) - plumbing"
>
> This patch adds a VM argument, bind/unbind, and the object
> offset/size/color getters/setters. It preserves the old ggtt helper
> functions because things still need, and will continue to need them.
>
> Some code will still need to be ported over after this.
>
> v2: Fix purge to pick an object and unbind all vmas
> This was doable because of the global bound list change.
>
> v3: With the commit to actually pin/unpin pages in place, there is no
> longer a need to check if unbind succeeded before calling put_pages().
> Make put_pages only BUG() after checking pin count.
>
> v4: Rebased on top of the new hangcheck work by Mika
> plumbed eb_destroy also
> Many checkpatch related fixes
>
> v5: Very large rebase
>
> v6:
> Change BUG_ON to WARN_ON (Daniel)
> Rename vm to ggtt in preallocate stolen, since it is always ggtt when
> dealing with stolen memory. (Daniel)
> list_for_each will short-circuit already (Daniel)
> remove superflous space (Daniel)
> Use per object list of vmas (Daniel)
> Make obj_bound_any() use obj_bound for each vm (Ben)
> s/bind_to_gtt/bind_to_vm/ (Ben)
>
> Fixed up the inactive shrinker. As Daniel noticed the code could
> potentially count the same object multiple times. While it's not
> possible in the current case, since 1 object can only ever be bound into
> 1 address space thus far - we may as well try to get something more
> future proof in place now. With a prep patch before this to switch over
> to using the bound list + inactive check, we're now able to carry that
> forward for every address space an object is bound into.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  29 ++-
>  drivers/gpu/drm/i915/i915_dma.c            |   4 -
>  drivers/gpu/drm/i915/i915_drv.h            | 107 +++++----
>  drivers/gpu/drm/i915/i915_gem.c            | 337 +++++++++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem_context.c    |   9 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      |  51 +++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 +++++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 ++--
>  drivers/gpu/drm/i915/i915_gem_stolen.c     |  10 +-
>  drivers/gpu/drm/i915/i915_gem_tiling.c     |  10 +-
>  drivers/gpu/drm/i915/i915_trace.h          |  20 +-
>  drivers/gpu/drm/i915/intel_fb.c            |   1 -
>  drivers/gpu/drm/i915/intel_overlay.c       |   2 +-
>  drivers/gpu/drm/i915/intel_pm.c            |   2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c    |  16 +-
>  15 files changed, 479 insertions(+), 245 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index be69807..f8e590f 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj)
>  static void
>  describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  {
> +       struct i915_vma *vma;
>         seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s",
>                    &obj->base,
>                    get_pin_flag(obj),
> @@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>                 seq_printf(m, " (pinned x %d)", obj->pin_count);
>         if (obj->fence_reg != I915_FENCE_REG_NONE)
>                 seq_printf(m, " (fence: %d)", obj->fence_reg);
> -       if (i915_gem_obj_ggtt_bound(obj))
> -               seq_printf(m, " (gtt offset: %08lx, size: %08x)",
> -                          i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj));
> +       list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +               if (!i915_is_ggtt(vma->vm))
> +                       seq_puts(m, " (pp");
> +               else
> +                       seq_puts(m, " (g");
> +               seq_printf(m, "gtt offset: %08lx, size: %08lx)",
> +                          i915_gem_obj_offset(obj, vma->vm),
> +                          i915_gem_obj_size(obj, vma->vm));
> +       }
>         if (obj->stolen)
>                 seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
>         if (obj->pin_mappable || obj->fault_mappable) {
> @@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>         return 0;
>  }
>
> +/* FIXME: Support multiple VM? */
>  #define count_objects(list, member) do { \
>         list_for_each_entry(obj, list, member) { \
>                 size += i915_gem_obj_ggtt_size(obj); \
> @@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val)
>
>         if (val & DROP_BOUND) {
>                 list_for_each_entry_safe(obj, next, &vm->inactive_list,
> -                                        mm_list)
> -                       if (obj->pin_count == 0) {
> -                               ret = i915_gem_object_unbind(obj);
> -                               if (ret)
> -                                       goto unlock;
> -                       }
> +                                        mm_list) {
> +                       if (obj->pin_count)
> +                               continue;
> +
> +                       ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
> +                       if (ret)
> +                               goto unlock;
> +               }
>         }
>
>         if (val & DROP_UNBOUND) {
>                 list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
>                                          global_list)
>                         if (obj->pages_pin_count == 0) {
> +                               /* FIXME: Do this for all vms? */
>                                 ret = i915_gem_object_put_pages(obj);
>                                 if (ret)
>                                         goto unlock;
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 1449d06..4650519 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>
>         i915_dump_device_info(dev_priv);
>
> -       INIT_LIST_HEAD(&dev_priv->vm_list);
> -       INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
> -       list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
> -
>         if (i915_get_bridge_dev(dev)) {
>                 ret = -EIO;
>                 goto free_priv;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8b3167e..681cb41 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1379,52 +1379,6 @@ struct drm_i915_gem_object {
>
>  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
>
> -/* This is a temporary define to help transition us to real VMAs. If you see
> - * this, you're either reviewing code, or bisecting it. */
> -static inline struct i915_vma *
> -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
> -{
> -       if (list_empty(&obj->vma_list))
> -               return NULL;
> -       return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
> -}
> -
> -/* Whether or not this object is currently mapped by the translation tables */
> -static inline bool
> -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
> -{
> -       struct i915_vma *vma = __i915_gem_obj_to_vma(o);
> -       if (vma == NULL)
> -               return false;
> -       return drm_mm_node_allocated(&vma->node);
> -}
> -
> -/* Offset of the first PTE pointing to this object */
> -static inline unsigned long
> -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
> -{
> -       BUG_ON(list_empty(&o->vma_list));
> -       return __i915_gem_obj_to_vma(o)->node.start;
> -}
> -
> -/* The size used in the translation tables may be larger than the actual size of
> - * the object on GEN2/GEN3 because of the way tiling is handled. See
> - * i915_gem_get_gtt_size() for more details.
> - */
> -static inline unsigned long
> -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
> -{
> -       BUG_ON(list_empty(&o->vma_list));
> -       return __i915_gem_obj_to_vma(o)->node.size;
> -}
> -
> -static inline void
> -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o,
> -                           enum i915_cache_level color)
> -{
> -       __i915_gem_obj_to_vma(o)->node.color = color;
> -}
> -
>  /**
>   * Request queue structure.
>   *
> @@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  void i915_gem_vma_destroy(struct i915_vma *vma);
>
>  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
> +                                    struct i915_address_space *vm,
>                                      uint32_t alignment,
>                                      bool map_and_fenceable,
>                                      bool nonblocking);
>  void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
> -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
> +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> +                                       struct i915_address_space *vm);
>  int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
>  void i915_gem_lastclose(struct drm_device *dev);
> @@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>                          struct intel_ring_buffer *to);
>  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> +                                   struct i915_address_space *vm,
>                                     struct intel_ring_buffer *ring);
>
>  int i915_gem_dumb_create(struct drm_file *file_priv,
> @@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
>                             int tiling_mode, bool fenced);
>
>  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> +                                   struct i915_address_space *vm,
>                                     enum i915_cache_level cache_level);
>
>  struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> @@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
>
>  void i915_gem_restore_fences(struct drm_device *dev);
>
> +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> +                                 struct i915_address_space *vm);
> +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
> +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> +                       struct i915_address_space *vm);
> +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> +                               struct i915_address_space *vm);
> +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> +                           struct i915_address_space *vm,
> +                           enum i915_cache_level color);
> +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> +                                    struct i915_address_space *vm);
> +/* Some GGTT VM helpers */
> +#define obj_to_ggtt(obj) \
> +       (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> +static inline bool i915_is_ggtt(struct i915_address_space *vm)
> +{
> +       struct i915_address_space *ggtt =
> +               &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
> +       return vm == ggtt;
> +}
> +
> +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj)
> +{
> +       return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline unsigned long
> +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
> +{
> +       return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline unsigned long
> +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
> +{
> +       return i915_gem_obj_size(obj, obj_to_ggtt(obj));
> +}
> +
> +static inline int __must_check
> +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
> +                 uint32_t alignment,
> +                 bool map_and_fenceable,
> +                 bool nonblocking)
> +{
> +       return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
> +                                  map_and_fenceable, nonblocking);
> +}
> +#undef obj_to_ggtt
> +
>  /* i915_gem_context.c */
>  void i915_gem_context_init(struct drm_device *dev);
>  void i915_gem_context_fini(struct drm_device *dev);
> @@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>
>  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
>  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> +/* FIXME: this is never okay with full PPGTT */
>  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
>                                 enum i915_cache_level cache_level);
>  void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> @@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
>
>
>  /* i915_gem_evict.c */
> -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
> +int __must_check i915_gem_evict_something(struct drm_device *dev,
> +                                         struct i915_address_space *vm,
> +                                         int min_size,
>                                           unsigned alignment,
>                                           unsigned cache_level,
>                                           bool mappable,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 2283765..0111554 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -38,10 +38,12 @@
>
>  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
>  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
> -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> -                                                   unsigned alignment,
> -                                                   bool map_and_fenceable,
> -                                                   bool nonblocking);
> +static __must_check int
> +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> +                          struct i915_address_space *vm,
> +                          unsigned alignment,
> +                          bool map_and_fenceable,
> +                          bool nonblocking);
>  static int i915_gem_phys_pwrite(struct drm_device *dev,
>                                 struct drm_i915_gem_object *obj,
>                                 struct drm_i915_gem_pwrite *args,
> @@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  static inline bool
>  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
>  {
> -       return i915_gem_obj_ggtt_bound(obj) && !obj->active;
> +       return i915_gem_obj_bound_any(obj) && !obj->active;
>  }
>
>  int
> @@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
>                  * anyway again before the next pread happens. */
>                 if (obj->cache_level == I915_CACHE_NONE)
>                         needs_clflush = 1;
> -               if (i915_gem_obj_ggtt_bound(obj)) {
> +               if (i915_gem_obj_bound_any(obj)) {
>                         ret = i915_gem_object_set_to_gtt_domain(obj, false);
>                         if (ret)
>                                 return ret;
> @@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
>         char __user *user_data;
>         int page_offset, page_length, ret;
>
> -       ret = i915_gem_object_pin(obj, 0, true, true);
> +       ret = i915_gem_ggtt_pin(obj, 0, true, true);
>         if (ret)
>                 goto out;
>
> @@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>                  * right away and we therefore have to clflush anyway. */
>                 if (obj->cache_level == I915_CACHE_NONE)
>                         needs_clflush_after = 1;
> -               if (i915_gem_obj_ggtt_bound(obj)) {
> +               if (i915_gem_obj_bound_any(obj)) {
>                         ret = i915_gem_object_set_to_gtt_domain(obj, true);
>                         if (ret)
>                                 return ret;
> @@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>         }
>
>         /* Now bind it into the GTT if needed */
> -       ret = i915_gem_object_pin(obj, 0, true, false);
> +       ret = i915_gem_ggtt_pin(obj,  0, true, false);
>         if (ret)
>                 goto unlock;
>
> @@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
>         if (obj->pages == NULL)
>                 return 0;
>
> -       BUG_ON(i915_gem_obj_ggtt_bound(obj));
> -
>         if (obj->pages_pin_count)
>                 return -EBUSY;
>
> +       BUG_ON(i915_gem_obj_bound_any(obj));
> +
>         /* ->put_pages might need to allocate memory for the bit17 swizzle
>          * array, hence protect them from being reaped by removing them from gtt
>          * lists early. */
> @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>                   bool purgeable_only)
>  {
>         struct drm_i915_gem_object *obj, *next;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
>         long count = 0;
>
>         list_for_each_entry_safe(obj, next,
> @@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>                 }
>         }
>
> -       list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) {
> -               if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> -                   i915_gem_object_unbind(obj) == 0 &&
> -                   i915_gem_object_put_pages(obj) == 0) {
> +       list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> +                                global_list) {
> +               struct i915_vma *vma, *v;
> +
> +               if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
> +                       continue;
> +
> +               list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
> +                       if (i915_gem_object_unbind(obj, vma->vm))
> +                               break;
> +
> +               if (!i915_gem_object_put_pages(obj))
>                         count += obj->base.size >> PAGE_SHIFT;
> -                       if (count >= target)
> -                               return count;
> -               }
> +
> +               if (count >= target)
> +                       return count;
>         }
>
>         return count;
> @@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>
>  void
>  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> +                              struct i915_address_space *vm,
>                                struct intel_ring_buffer *ring)
>  {
>         struct drm_device *dev = obj->base.dev;
>         struct drm_i915_private *dev_priv = dev->dev_private;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
>         u32 seqno = intel_ring_get_seqno(ring);
>
>         BUG_ON(ring == NULL);
> @@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  }
>
>  static void
> -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> +                                struct i915_address_space *vm)
>  {
> -       struct drm_device *dev = obj->base.dev;
> -       struct drm_i915_private *dev_priv = dev->dev_private;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
> -
>         BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>         BUG_ON(!obj->active);
>
> @@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
>         spin_unlock(&file_priv->mm.lock);
>  }
>
> -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
> +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
> +                                   struct i915_address_space *vm)
>  {
> -       if (acthd >= i915_gem_obj_ggtt_offset(obj) &&
> -           acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size)
> +       if (acthd >= i915_gem_obj_offset(obj, vm) &&
> +           acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
>                 return true;
>
>         return false;
> @@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
>         return false;
>  }
>
> +static struct i915_address_space *
> +request_to_vm(struct drm_i915_gem_request *request)
> +{
> +       struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
> +       struct i915_address_space *vm;
> +
> +       vm = &dev_priv->gtt.base;
> +
> +       return vm;
> +}
> +
>  static bool i915_request_guilty(struct drm_i915_gem_request *request,
>                                 const u32 acthd, bool *inside)
>  {
> @@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
>          * pointing inside the ring, matches the batch_obj address range.
>          * However this is extremely unlikely.
>          */
> -
>         if (request->batch_obj) {
> -               if (i915_head_inside_object(acthd, request->batch_obj)) {
> +               if (i915_head_inside_object(acthd, request->batch_obj,
> +                                           request_to_vm(request))) {
>                         *inside = true;
>                         return true;
>                 }
> @@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring,
>  {
>         struct i915_ctx_hang_stats *hs = NULL;
>         bool inside, guilty;
> +       unsigned long offset = 0;
>
>         /* Innocent until proven guilty */
>         guilty = false;
>
> +       if (request->batch_obj)
> +               offset = i915_gem_obj_offset(request->batch_obj,
> +                                            request_to_vm(request));
> +
>         if (ring->hangcheck.action != wait &&
>             i915_request_guilty(request, acthd, &inside)) {
>                 DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
>                           ring->name,
>                           inside ? "inside" : "flushing",
> -                         request->batch_obj ?
> -                         i915_gem_obj_ggtt_offset(request->batch_obj) : 0,
> +                         offset,
>                           request->ctx ? request->ctx->id : 0,
>                           acthd);
>
> @@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
>         }
>
>         while (!list_empty(&ring->active_list)) {
> +               struct i915_address_space *vm;
>                 struct drm_i915_gem_object *obj;
>
>                 obj = list_first_entry(&ring->active_list,
>                                        struct drm_i915_gem_object,
>                                        ring_list);
>
> -               i915_gem_object_move_to_inactive(obj);
> +               list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +                       i915_gem_object_move_to_inactive(obj, vm);
>         }
>  }
>
> @@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  void i915_gem_reset(struct drm_device *dev)
>  {
>         struct drm_i915_private *dev_priv = dev->dev_private;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
> +       struct i915_address_space *vm;
>         struct drm_i915_gem_object *obj;
>         struct intel_ring_buffer *ring;
>         int i;
> @@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev)
>         /* Move everything out of the GPU domains to ensure we do any
>          * necessary invalidation upon reuse.
>          */
> -       list_for_each_entry(obj, &vm->inactive_list, mm_list)
> -               obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> +       list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +               list_for_each_entry(obj, &vm->inactive_list, mm_list)
> +                       obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
>
>         i915_gem_restore_fences(dev);
>  }
> @@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>          * by the ringbuffer to the flushing/inactive lists as appropriate.
>          */
>         while (!list_empty(&ring->active_list)) {
> +               struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +               struct i915_address_space *vm;
>                 struct drm_i915_gem_object *obj;
>
>                 obj = list_first_entry(&ring->active_list,
> @@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
>                 if (!i915_seqno_passed(seqno, obj->last_read_seqno))
>                         break;
>
> -               i915_gem_object_move_to_inactive(obj);
> +               list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +                       i915_gem_object_move_to_inactive(obj, vm);
>         }
>
>         if (unlikely(ring->trace_irq_seqno &&
> @@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
>   * Unbinds an object from the GTT aperture.
>   */
>  int
> -i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> +i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> +                      struct i915_address_space *vm)
>  {
>         drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
>         struct i915_vma *vma;
>         int ret;
>
> -       if (!i915_gem_obj_ggtt_bound(obj))
> +       if (!i915_gem_obj_bound(obj, vm))
>                 return 0;
>
>         if (obj->pin_count)
> @@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>         if (ret)
>                 return ret;
>
> -       trace_i915_gem_object_unbind(obj);
> +       trace_i915_gem_object_unbind(obj, vm);
>
>         if (obj->has_global_gtt_mapping)
>                 i915_gem_gtt_unbind_object(obj);
> @@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>         /* Avoid an unnecessary call to unbind on rebind. */
>         obj->map_and_fenceable = true;
>
> -       vma = __i915_gem_obj_to_vma(obj);
> +       vma = i915_gem_obj_to_vma(obj, vm);
>         list_del(&vma->vma_link);
>         drm_mm_remove_node(&vma->node);
>         i915_gem_vma_destroy(vma);
> @@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
>                      "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
>                      i915_gem_obj_ggtt_offset(obj), size);
>
> +
>                 pitch_val = obj->stride / 128;
>                 pitch_val = ffs(pitch_val) - 1;
>
> @@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
>   * Finds free space in the GTT aperture and binds the object there.
>   */
>  static int
> -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> -                           unsigned alignment,
> -                           bool map_and_fenceable,
> -                           bool nonblocking)
> +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> +                          struct i915_address_space *vm,
> +                          unsigned alignment,
> +                          bool map_and_fenceable,
> +                          bool nonblocking)
>  {
>         struct drm_device *dev = obj->base.dev;
>         drm_i915_private_t *dev_priv = dev->dev_private;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
>         u32 size, fence_size, fence_alignment, unfenced_alignment;
>         bool mappable, fenceable;
> -       size_t gtt_max = map_and_fenceable ?
> -               dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> +       size_t gtt_max =
> +               map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
>         struct i915_vma *vma;
>         int ret;
>
>         if (WARN_ON(!list_empty(&obj->vma_list)))
>                 return -EBUSY;
>
> +       BUG_ON(!i915_is_ggtt(vm));
> +
>         fence_size = i915_gem_get_gtt_size(dev,
>                                            obj->base.size,
>                                            obj->tiling_mode);
> @@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>
>         i915_gem_object_pin_pages(obj);
>
> -       vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> +       /* For now we only ever use 1 vma per object */
> +       WARN_ON(!list_empty(&obj->vma_list));
> +
> +       vma = i915_gem_vma_create(obj, vm);
>         if (IS_ERR(vma)) {
>                 i915_gem_object_unpin_pages(obj);
>                 return PTR_ERR(vma);
>         }
>
>  search_free:
> -       ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> -                                                 &vma->node,
> +       ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
>                                                   size, alignment,
>                                                   obj->cache_level, 0, gtt_max);
>         if (ret) {
> -               ret = i915_gem_evict_something(dev, size, alignment,
> +               ret = i915_gem_evict_something(dev, vm, size, alignment,
>                                                obj->cache_level,
>                                                map_and_fenceable,
>                                                nonblocking);
> @@ -3138,18 +3172,25 @@ search_free:
>
>         list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
>         list_add_tail(&obj->mm_list, &vm->inactive_list);
> -       list_add(&vma->vma_link, &obj->vma_list);
> +
> +       /* Keep GGTT vmas first to make debug easier */
> +       if (i915_is_ggtt(vm))
> +               list_add(&vma->vma_link, &obj->vma_list);
> +       else
> +               list_add_tail(&vma->vma_link, &obj->vma_list);
>
>         fenceable =
> +               i915_is_ggtt(vm) &&
>                 i915_gem_obj_ggtt_size(obj) == fence_size &&
>                 (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
>
> -       mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <=
> -               dev_priv->gtt.mappable_end;
> +       mappable =
> +               i915_is_ggtt(vm) &&
> +               vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
>
>         obj->map_and_fenceable = mappable && fenceable;
>
> -       trace_i915_gem_object_bind(obj, map_and_fenceable);
> +       trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
>         i915_gem_verify_gtt(dev);
>         return 0;
>
> @@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>         int ret;
>
>         /* Not valid to be called on unbound objects. */
> -       if (!i915_gem_obj_ggtt_bound(obj))
> +       if (!i915_gem_obj_bound_any(obj))
>                 return -EINVAL;
>
>         if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> @@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  }
>
>  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> +                                   struct i915_address_space *vm,
>                                     enum i915_cache_level cache_level)
>  {
>         struct drm_device *dev = obj->base.dev;
>         drm_i915_private_t *dev_priv = dev->dev_private;
> -       struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> +       struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
>         int ret;
>
>         if (obj->cache_level == cache_level)
> @@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>         }
>
>         if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> -               ret = i915_gem_object_unbind(obj);
> +               ret = i915_gem_object_unbind(obj, vm);
>                 if (ret)
>                         return ret;
>         }
>
> -       if (i915_gem_obj_ggtt_bound(obj)) {
> +       list_for_each_entry(vma, &obj->vma_list, vma_link) {
>                 ret = i915_gem_object_finish_gpu(obj);
>                 if (ret)
>                         return ret;
> @@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>                         i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
>                                                obj, cache_level);
>
> -               i915_gem_obj_ggtt_set_color(obj, cache_level);
> +               i915_gem_obj_set_color(obj, vma->vm, cache_level);
>         }
>
>         if (cache_level == I915_CACHE_NONE) {
> @@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>                                struct drm_file *file)
>  {
>         struct drm_i915_gem_caching *args = data;
> +       struct drm_i915_private *dev_priv;
>         struct drm_i915_gem_object *obj;
>         enum i915_cache_level level;
>         int ret;
> @@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>                 ret = -ENOENT;
>                 goto unlock;
>         }
> +       dev_priv = obj->base.dev->dev_private;
>
> -       ret = i915_gem_object_set_cache_level(obj, level);
> +       /* FIXME: Add interface for specific VM? */
> +       ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
>
>         drm_gem_object_unreference(&obj->base);
>  unlock:
> @@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>                                      u32 alignment,
>                                      struct intel_ring_buffer *pipelined)
>  {
> +       struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>         u32 old_read_domains, old_write_domain;
>         int ret;
>
> @@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>          * of uncaching, which would allow us to flush all the LLC-cached data
>          * with that bit in the PTE to main memory with just one PIPE_CONTROL.
>          */
> -       ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> +       ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +                                             I915_CACHE_NONE);
>         if (ret)
>                 return ret;
>
> @@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>          * (e.g. libkms for the bootup splash), we have to ensure that we
>          * always use map_and_fenceable for all scanout buffers.
>          */
> -       ret = i915_gem_object_pin(obj, alignment, true, false);
> +       ret = i915_gem_ggtt_pin(obj, alignment, true, false);
>         if (ret)
>                 return ret;
>
> @@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>
>  int
>  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> +                   struct i915_address_space *vm,
>                     uint32_t alignment,
>                     bool map_and_fenceable,
>                     bool nonblocking)
> @@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>         if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
>                 return -EBUSY;
>
> -       if (i915_gem_obj_ggtt_bound(obj)) {
> -               if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> +       WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
> +
> +       if (i915_gem_obj_bound(obj, vm)) {
> +               if ((alignment &&
> +                    i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
>                     (map_and_fenceable && !obj->map_and_fenceable)) {
>                         WARN(obj->pin_count,
>                              "bo is already pinned with incorrect alignment:"
>                              " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
>                              " obj->map_and_fenceable=%d\n",
> -                            i915_gem_obj_ggtt_offset(obj), alignment,
> +                            i915_gem_obj_offset(obj, vm), alignment,
>                              map_and_fenceable,
>                              obj->map_and_fenceable);
> -                       ret = i915_gem_object_unbind(obj);
> +                       ret = i915_gem_object_unbind(obj, vm);
>                         if (ret)
>                                 return ret;
>                 }
>         }
>
> -       if (!i915_gem_obj_ggtt_bound(obj)) {
> +       if (!i915_gem_obj_bound(obj, vm)) {
>                 struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>
> -               ret = i915_gem_object_bind_to_gtt(obj, alignment,
> -                                                 map_and_fenceable,
> -                                                 nonblocking);
> +               ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
> +                                                map_and_fenceable,
> +                                                nonblocking);
>                 if (ret)
>                         return ret;
>
> @@ -3666,7 +3717,7 @@ void
>  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
>  {
>         BUG_ON(obj->pin_count == 0);
> -       BUG_ON(!i915_gem_obj_ggtt_bound(obj));
> +       BUG_ON(!i915_gem_obj_bound_any(obj));
>
>         if (--obj->pin_count == 0)
>                 obj->pin_mappable = false;
> @@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
>         }
>
>         if (obj->user_pin_count == 0) {
> -               ret = i915_gem_object_pin(obj, args->alignment, true, false);
> +               ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
>                 if (ret)
>                         goto out;
>         }
> @@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>         struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
>         struct drm_device *dev = obj->base.dev;
>         drm_i915_private_t *dev_priv = dev->dev_private;
> +       struct i915_vma *vma, *next;
>
>         trace_i915_gem_object_destroy(obj);
>
> @@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>                 i915_gem_detach_phys_object(dev, obj);
>
>         obj->pin_count = 0;
> -       if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
> -               bool was_interruptible;
> +       /* NB: 0 or 1 elements */
> +       WARN_ON(!list_empty(&obj->vma_list) &&
> +               !list_is_singular(&obj->vma_list));
> +       list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> +               int ret = i915_gem_object_unbind(obj, vma->vm);
> +               if (WARN_ON(ret == -ERESTARTSYS)) {
> +                       bool was_interruptible;
>
> -               was_interruptible = dev_priv->mm.interruptible;
> -               dev_priv->mm.interruptible = false;
> +                       was_interruptible = dev_priv->mm.interruptible;
> +                       dev_priv->mm.interruptible = false;
>
> -               WARN_ON(i915_gem_object_unbind(obj));
> +                       WARN_ON(i915_gem_object_unbind(obj, vma->vm));
>
> -               dev_priv->mm.interruptible = was_interruptible;
> +                       dev_priv->mm.interruptible = was_interruptible;
> +               }
>         }
>
>         /* Stolen objects don't hold a ref, but do hold pin count. Fix that up
> @@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
>         INIT_LIST_HEAD(&ring->request_list);
>  }
>
> +static void i915_init_vm(struct drm_i915_private *dev_priv,
> +                        struct i915_address_space *vm)
> +{
> +       vm->dev = dev_priv->dev;
> +       INIT_LIST_HEAD(&vm->active_list);
> +       INIT_LIST_HEAD(&vm->inactive_list);
> +       INIT_LIST_HEAD(&vm->global_link);
> +       list_add(&vm->global_link, &dev_priv->vm_list);
> +}
> +
>  void
>  i915_gem_load(struct drm_device *dev)
>  {
> @@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev)
>                                   SLAB_HWCACHE_ALIGN,
>                                   NULL);
>
> -       INIT_LIST_HEAD(&dev_priv->gtt.base.active_list);
> -       INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list);
> +       INIT_LIST_HEAD(&dev_priv->vm_list);
> +       i915_init_vm(dev_priv, &dev_priv->gtt.base);
> +
>         INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
>         INIT_LIST_HEAD(&dev_priv->mm.bound_list);
>         INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> @@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>                              struct drm_i915_private,
>                              mm.inactive_shrinker);
>         struct drm_device *dev = dev_priv->dev;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
>         struct drm_i915_gem_object *obj;
> -       int nr_to_scan = sc->nr_to_scan;
> +       int nr_to_scan;
>         bool unlock = true;
>         int cnt;
>
> @@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>                 unlock = false;
>         }
>
> +       nr_to_scan = sc->nr_to_scan;
>         if (nr_to_scan) {
>                 nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
>                 if (nr_to_scan > 0)
> @@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>         list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
>                 if (obj->pages_pin_count == 0)
>                         cnt += obj->base.size >> PAGE_SHIFT;
> -       list_for_each_entry(obj, &vm->inactive_list, mm_list)
> +
> +       list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> +               if (obj->active)
> +                       continue;
> +
> +               i915_gem_object_flush_gtt_write_domain(obj);
> +               i915_gem_object_flush_cpu_write_domain(obj);
> +               /* FIXME: Can't assume global gtt */
> +               i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base);
> +
>                 if (obj->pin_count == 0 && obj->pages_pin_count == 0)
>                         cnt += obj->base.size >> PAGE_SHIFT;
> +       }
>
>         if (unlock)
>                 mutex_unlock(&dev->struct_mutex);
>         return cnt;
>  }
> +
> +/* All the new VM stuff */
> +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> +                                 struct i915_address_space *vm)
> +{
> +       struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> +       struct i915_vma *vma;
> +
> +       if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> +               vm = &dev_priv->gtt.base;
> +
> +       BUG_ON(list_empty(&o->vma_list));
> +       list_for_each_entry(vma, &o->vma_list, vma_link) {
> +               if (vma->vm == vm)
> +                       return vma->node.start;
> +
> +       }
> +       return -1;
> +}
> +
> +bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
> +                       struct i915_address_space *vm)
> +{
> +       struct i915_vma *vma;
> +
> +       list_for_each_entry(vma, &o->vma_list, vma_link)
> +               if (vma->vm == vm)
> +                       return true;
> +
> +       return false;
> +}
> +
> +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
> +{
> +       struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> +       struct i915_address_space *vm;
> +
> +       list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> +               if (i915_gem_obj_bound(o, vm))
> +                       return true;
> +
> +       return false;
> +}
> +
> +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
> +                               struct i915_address_space *vm)
> +{
> +       struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> +       struct i915_vma *vma;
> +
> +       if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> +               vm = &dev_priv->gtt.base;
> +
> +       BUG_ON(list_empty(&o->vma_list));
> +
> +       list_for_each_entry(vma, &o->vma_list, vma_link)
> +               if (vma->vm == vm)
> +                       return vma->node.size;
> +
> +       return 0;
> +}
> +
> +void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> +                           struct i915_address_space *vm,
> +                           enum i915_cache_level color)
> +{
> +       struct i915_vma *vma;
> +       BUG_ON(list_empty(&o->vma_list));
> +       list_for_each_entry(vma, &o->vma_list, vma_link) {
> +               if (vma->vm == vm) {
> +                       vma->node.color = color;
> +                       return;
> +               }
> +       }
> +
> +       WARN(1, "Couldn't set color for VM %p\n", vm);
> +}
> +
> +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> +                                    struct i915_address_space *vm)
> +{
> +       struct i915_vma *vma;
> +       list_for_each_entry(vma, &obj->vma_list, vma_link)
> +               if (vma->vm == vm)
> +                       return vma;
> +
> +       return NULL;
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 2470206..873577d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev,
>
>         if (INTEL_INFO(dev)->gen >= 7) {
>                 ret = i915_gem_object_set_cache_level(ctx->obj,
> +                                                     &dev_priv->gtt.base,
>                                                       I915_CACHE_LLC_MLC);
>                 /* Failure shouldn't ever happen this early */
>                 if (WARN_ON(ret))
> @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
>          * default context.
>          */
>         dev_priv->ring[RCS].default_context = ctx;
> -       ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
> +       ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
>         if (ret) {
>                 DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
>                 goto err_destroy;
> @@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring,
>  static int do_switch(struct i915_hw_context *to)
>  {
>         struct intel_ring_buffer *ring = to->ring;
> +       struct drm_i915_private *dev_priv = ring->dev->dev_private;
>         struct i915_hw_context *from = ring->last_context;
>         u32 hw_flags = 0;
>         int ret;
> @@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to)
>         if (from == to)
>                 return 0;
>
> -       ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
> +       ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
>         if (ret)
>                 return ret;
>
> @@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to)
>          */
>         if (from != NULL) {
>                 from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> -               i915_gem_object_move_to_active(from->obj, ring);
> +               i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
> +                                              ring);
>                 /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
>                  * whole damn pipeline, we don't need to explicitly mark the
>                  * object dirty. The only exception is that the context must be
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index df61f33..32efdc0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -32,24 +32,21 @@
>  #include "i915_trace.h"
>
>  static bool
> -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> +mark_free(struct i915_vma *vma, struct list_head *unwind)
>  {
> -       struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> -
> -       if (obj->pin_count)
> +       if (vma->obj->pin_count)
>                 return false;
>
> -       list_add(&obj->exec_list, unwind);
> +       list_add(&vma->obj->exec_list, unwind);
>         return drm_mm_scan_add_block(&vma->node);
>  }
>
>  int
> -i915_gem_evict_something(struct drm_device *dev, int min_size,
> -                        unsigned alignment, unsigned cache_level,
> +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
> +                        int min_size, unsigned alignment, unsigned cache_level,
>                          bool mappable, bool nonblocking)
>  {
>         drm_i915_private_t *dev_priv = dev->dev_private;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
>         struct list_head eviction_list, unwind_list;
>         struct i915_vma *vma;
>         struct drm_i915_gem_object *obj;
> @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>          */
>
>         INIT_LIST_HEAD(&unwind_list);
> -       if (mappable)
> +       if (mappable) {
> +               BUG_ON(!i915_is_ggtt(vm));
>                 drm_mm_init_scan_with_range(&vm->mm, min_size,
>                                             alignment, cache_level, 0,
>                                             dev_priv->gtt.mappable_end);
> -       else
> +       } else
>                 drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
>
>         /* First see if there is a large enough contiguous idle region... */
>         list_for_each_entry(obj, &vm->inactive_list, mm_list) {
> -               if (mark_free(obj, &unwind_list))
> +               struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +               if (mark_free(vma, &unwind_list))
>                         goto found;
>         }
>
> @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>
>         /* Now merge in the soon-to-be-expired objects... */
>         list_for_each_entry(obj, &vm->active_list, mm_list) {
> -               if (mark_free(obj, &unwind_list))
> +               struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> +               if (mark_free(vma, &unwind_list))
>                         goto found;
>         }
>
> @@ -109,7 +109,7 @@ none:
>                 obj = list_first_entry(&unwind_list,
>                                        struct drm_i915_gem_object,
>                                        exec_list);
> -               vma = __i915_gem_obj_to_vma(obj);
> +               vma = i915_gem_obj_to_vma(obj, vm);
>                 ret = drm_mm_scan_remove_block(&vma->node);
>                 BUG_ON(ret);
>
> @@ -130,7 +130,7 @@ found:
>                 obj = list_first_entry(&unwind_list,
>                                        struct drm_i915_gem_object,
>                                        exec_list);
> -               vma = __i915_gem_obj_to_vma(obj);
> +               vma = i915_gem_obj_to_vma(obj, vm);
>                 if (drm_mm_scan_remove_block(&vma->node)) {
>                         list_move(&obj->exec_list, &eviction_list);
>                         drm_gem_object_reference(&obj->base);
> @@ -145,7 +145,7 @@ found:
>                                        struct drm_i915_gem_object,
>                                        exec_list);
>                 if (ret == 0)
> -                       ret = i915_gem_object_unbind(obj);
> +                       ret = i915_gem_object_unbind(obj, vm);
>
>                 list_del_init(&obj->exec_list);
>                 drm_gem_object_unreference(&obj->base);
> @@ -158,13 +158,18 @@ int
>  i915_gem_evict_everything(struct drm_device *dev)
>  {
>         drm_i915_private_t *dev_priv = dev->dev_private;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
> +       struct i915_address_space *vm;
>         struct drm_i915_gem_object *obj, *next;
> -       bool lists_empty;
> +       bool lists_empty = true;
>         int ret;
>
> -       lists_empty = (list_empty(&vm->inactive_list) &&
> -                      list_empty(&vm->active_list));
> +       list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +               lists_empty = (list_empty(&vm->inactive_list) &&
> +                              list_empty(&vm->active_list));
> +               if (!lists_empty)
> +                       lists_empty = false;
> +       }
> +
>         if (lists_empty)
>                 return -ENOSPC;
>
> @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev)
>         i915_gem_retire_requests(dev);
>
>         /* Having flushed everything, unbind() should never raise an error */
> -       list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> -               if (obj->pin_count == 0)
> -                       WARN_ON(i915_gem_object_unbind(obj));
> +       list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> +               list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
> +                       if (obj->pin_count == 0)
> +                               WARN_ON(i915_gem_object_unbind(obj, vm));
> +       }
>
>         return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 1734825..819d8d8 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
>  }
>
>  static void
> -eb_destroy(struct eb_objects *eb)
> +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
>  {
>         while (!list_empty(&eb->objects)) {
>                 struct drm_i915_gem_object *obj;
> @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>                                    struct eb_objects *eb,
> -                                  struct drm_i915_gem_relocation_entry *reloc)
> +                                  struct drm_i915_gem_relocation_entry *reloc,
> +                                  struct i915_address_space *vm)
>  {
>         struct drm_device *dev = obj->base.dev;
>         struct drm_gem_object *target_obj;
> @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>
>  static int
>  i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
> -                                   struct eb_objects *eb)
> +                                   struct eb_objects *eb,
> +                                   struct i915_address_space *vm)
>  {
>  #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
>         struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
> @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
>                 do {
>                         u64 offset = r->presumed_offset;
>
> -                       ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
> +                       ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
> +                                                                vm);
>                         if (ret)
>                                 return ret;
>
> @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
>  static int
>  i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
>                                          struct eb_objects *eb,
> -                                        struct drm_i915_gem_relocation_entry *relocs)
> +                                        struct drm_i915_gem_relocation_entry *relocs,
> +                                        struct i915_address_space *vm)
>  {
>         const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
>         int i, ret;
>
>         for (i = 0; i < entry->relocation_count; i++) {
> -               ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
> +               ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
> +                                                        vm);
>                 if (ret)
>                         return ret;
>         }
> @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
>  }
>
>  static int
> -i915_gem_execbuffer_relocate(struct eb_objects *eb)
> +i915_gem_execbuffer_relocate(struct eb_objects *eb,
> +                            struct i915_address_space *vm)
>  {
>         struct drm_i915_gem_object *obj;
>         int ret = 0;
> @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
>          */
>         pagefault_disable();
>         list_for_each_entry(obj, &eb->objects, exec_list) {
> -               ret = i915_gem_execbuffer_relocate_object(obj, eb);
> +               ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
>                 if (ret)
>                         break;
>         }
> @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>                                    struct intel_ring_buffer *ring,
> +                                  struct i915_address_space *vm,
>                                    bool *need_reloc)
>  {
>         struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>                 obj->tiling_mode != I915_TILING_NONE;
>         need_mappable = need_fence || need_reloc_mappable(obj);
>
> -       ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
> +       ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
> +                                 false);
>         if (ret)
>                 return ret;
>
> @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>                 obj->has_aliasing_ppgtt_mapping = 1;
>         }
>
> -       if (entry->offset != i915_gem_obj_ggtt_offset(obj)) {
> -               entry->offset = i915_gem_obj_ggtt_offset(obj);
> +       if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> +               entry->offset = i915_gem_obj_offset(obj, vm);
>                 *need_reloc = true;
>         }
>
> @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  {
>         struct drm_i915_gem_exec_object2 *entry;
>
> -       if (!i915_gem_obj_ggtt_bound(obj))
> +       if (!i915_gem_obj_bound_any(obj))
>                 return;
>
>         entry = obj->exec_entry;
> @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  static int
>  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>                             struct list_head *objects,
> +                           struct i915_address_space *vm,
>                             bool *need_relocs)
>  {
>         struct drm_i915_gem_object *obj;
> @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>                 list_for_each_entry(obj, objects, exec_list) {
>                         struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
>                         bool need_fence, need_mappable;
> +                       u32 obj_offset;
>
> -                       if (!i915_gem_obj_ggtt_bound(obj))
> +                       if (!i915_gem_obj_bound(obj, vm))
>                                 continue;
>
> +                       obj_offset = i915_gem_obj_offset(obj, vm);
>                         need_fence =
>                                 has_fenced_gpu_access &&
>                                 entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
>                                 obj->tiling_mode != I915_TILING_NONE;
>                         need_mappable = need_fence || need_reloc_mappable(obj);
>
> +                       BUG_ON((need_mappable || need_fence) &&
> +                              !i915_is_ggtt(vm));
> +
>                         if ((entry->alignment &&
> -                            i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) ||
> +                            obj_offset & (entry->alignment - 1)) ||
>                             (need_mappable && !obj->map_and_fenceable))
> -                               ret = i915_gem_object_unbind(obj);
> +                               ret = i915_gem_object_unbind(obj, vm);
>                         else
> -                               ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> +                               ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
>                         if (ret)
>                                 goto err;
>                 }
>
>                 /* Bind fresh objects */
>                 list_for_each_entry(obj, objects, exec_list) {
> -                       if (i915_gem_obj_ggtt_bound(obj))
> +                       if (i915_gem_obj_bound(obj, vm))
>                                 continue;
>
> -                       ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> +                       ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
>                         if (ret)
>                                 goto err;
>                 }
> @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>                                   struct drm_file *file,
>                                   struct intel_ring_buffer *ring,
>                                   struct eb_objects *eb,
> -                                 struct drm_i915_gem_exec_object2 *exec)
> +                                 struct drm_i915_gem_exec_object2 *exec,
> +                                 struct i915_address_space *vm)
>  {
>         struct drm_i915_gem_relocation_entry *reloc;
>         struct drm_i915_gem_object *obj;
> @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
>                 goto err;
>
>         need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> -       ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> +       ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
>         if (ret)
>                 goto err;
>
>         list_for_each_entry(obj, &eb->objects, exec_list) {
>                 int offset = obj->exec_entry - exec;
>                 ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
> -                                                              reloc + reloc_offset[offset]);
> +                                                              reloc + reloc_offset[offset],
> +                                                              vm);
>                 if (ret)
>                         goto err;
>         }
> @@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
>
>  static void
>  i915_gem_execbuffer_move_to_active(struct list_head *objects,
> +                                  struct i915_address_space *vm,
>                                    struct intel_ring_buffer *ring)
>  {
>         struct drm_i915_gem_object *obj;
> @@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
>                 obj->base.read_domains = obj->base.pending_read_domains;
>                 obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
>
> -               i915_gem_object_move_to_active(obj, ring);
> +               i915_gem_object_move_to_active(obj, vm, ring);
>                 if (obj->base.write_domain) {
>                         obj->dirty = 1;
>                         obj->last_write_seqno = intel_ring_get_seqno(ring);
> @@ -838,7 +855,8 @@ static int
>  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>                        struct drm_file *file,
>                        struct drm_i915_gem_execbuffer2 *args,
> -                      struct drm_i915_gem_exec_object2 *exec)
> +                      struct drm_i915_gem_exec_object2 *exec,
> +                      struct i915_address_space *vm)
>  {
>         drm_i915_private_t *dev_priv = dev->dev_private;
>         struct eb_objects *eb;
> @@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>
>         /* Move the objects en-masse into the GTT, evicting if necessary. */
>         need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> -       ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
> +       ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
>         if (ret)
>                 goto err;
>
>         /* The objects are in their final locations, apply the relocations. */
>         if (need_relocs)
> -               ret = i915_gem_execbuffer_relocate(eb);
> +               ret = i915_gem_execbuffer_relocate(eb, vm);
>         if (ret) {
>                 if (ret == -EFAULT) {
>                         ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> -                                                               eb, exec);
> +                                                               eb, exec, vm);
>                         BUG_ON(!mutex_is_locked(&dev->struct_mutex));
>                 }
>                 if (ret)
> @@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>                         goto err;
>         }
>
> -       exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset;
> +       exec_start = i915_gem_obj_offset(batch_obj, vm) +
> +               args->batch_start_offset;
>         exec_len = args->batch_len;
>         if (cliprects) {
>                 for (i = 0; i < args->num_cliprects; i++) {
> @@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>
>         trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
>
> -       i915_gem_execbuffer_move_to_active(&eb->objects, ring);
> +       i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
>         i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
>
>  err:
> -       eb_destroy(eb);
> +       eb_destroy(eb, vm);
>
>         mutex_unlock(&dev->struct_mutex);
>
> @@ -1107,6 +1126,7 @@ int
>  i915_gem_execbuffer(struct drm_device *dev, void *data,
>                     struct drm_file *file)
>  {
> +       struct drm_i915_private *dev_priv = dev->dev_private;
>         struct drm_i915_gem_execbuffer *args = data;
>         struct drm_i915_gem_execbuffer2 exec2;
>         struct drm_i915_gem_exec_object *exec_list = NULL;
> @@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
>         exec2.flags = I915_EXEC_RENDER;
>         i915_execbuffer2_set_context_id(exec2, 0);
>
> -       ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> +       ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
> +                                    &dev_priv->gtt.base);
>         if (!ret) {
>                 /* Copy the new buffer offsets back to the user's exec list. */
>                 for (i = 0; i < args->buffer_count; i++)
> @@ -1188,6 +1209,7 @@ int
>  i915_gem_execbuffer2(struct drm_device *dev, void *data,
>                      struct drm_file *file)
>  {
> +       struct drm_i915_private *dev_priv = dev->dev_private;
>         struct drm_i915_gem_execbuffer2 *args = data;
>         struct drm_i915_gem_exec_object2 *exec2_list = NULL;
>         int ret;
> @@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
>                 return -EFAULT;
>         }
>
> -       ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> +       ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
> +                                    &dev_priv->gtt.base);
>         if (!ret) {
>                 /* Copy the new buffer offsets back to the user's exec list. */
>                 ret = copy_to_user(to_user_ptr(args->buffers_ptr),
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 3b639a9..44f3464 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
>                             ppgtt->base.total);
>         }
>
> +       /* i915_init_vm(dev_priv, &ppgtt->base) */
> +
>         return ret;
>  }
>
> @@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
>                             struct drm_i915_gem_object *obj,
>                             enum i915_cache_level cache_level)
>  {
> -       ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> -                                  i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -                                  cache_level);
> +       struct i915_address_space *vm = &ppgtt->base;
> +       unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> +
> +       vm->insert_entries(vm, obj->pages,
> +                          obj_offset >> PAGE_SHIFT,
> +                          cache_level);
>  }
>
>  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>                               struct drm_i915_gem_object *obj)
>  {
> -       ppgtt->base.clear_range(&ppgtt->base,
> -                               i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> -                               obj->base.size >> PAGE_SHIFT);
> +       struct i915_address_space *vm = &ppgtt->base;
> +       unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> +
> +       vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> +                       obj->base.size >> PAGE_SHIFT);
>  }
>
>  extern int intel_iommu_gfx_mapped;
> @@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>                                        dev_priv->gtt.base.start / PAGE_SIZE,
>                                        dev_priv->gtt.base.total / PAGE_SIZE);
>
> +       if (dev_priv->mm.aliasing_ppgtt)
> +               gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> +
>         list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>                 i915_gem_clflush_object(obj);
>                 i915_gem_gtt_bind_object(obj, obj->cache_level);
> @@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>          * aperture.  One page should be enough to keep any prefetching inside
>          * of the aperture.
>          */
> -       drm_i915_private_t *dev_priv = dev->dev_private;
> +       struct drm_i915_private *dev_priv = dev->dev_private;
> +       struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
>         struct drm_mm_node *entry;
>         struct drm_i915_gem_object *obj;
>         unsigned long hole_start, hole_end;
> @@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>         BUG_ON(mappable_end > end);
>
>         /* Subtract the guard page ... */
> -       drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> +       drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
>         if (!HAS_LLC(dev))
>                 dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
>
>         /* Mark any preallocated objects as occupied */
>         list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> -               struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> +               struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
>                 int ret;
>                 DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
>                               i915_gem_obj_ggtt_offset(obj), obj->base.size);
>
>                 WARN_ON(i915_gem_obj_ggtt_bound(obj));
> -               ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> +               ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
>                 if (ret)
>                         DRM_DEBUG_KMS("Reservation failed\n");
>                 obj->has_global_gtt_mapping = 1;
> @@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>         dev_priv->gtt.base.total = end - start;
>
>         /* Clear any non-preallocated blocks */
> -       drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> -                            hole_start, hole_end) {
> +       drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
>                 const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
>                 DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
>                               hole_start, hole_end);
> -               dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> -                                              hole_start / PAGE_SIZE,
> -                                              count);
> +               ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
>         }
>
>         /* And finally clear the reserved guard page */
> -       dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> -                                      end / PAGE_SIZE - 1, 1);
> +       ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
>  }
>
>  static bool
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 27ffb4c..000ffbd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>                                                u32 size)
>  {
>         struct drm_i915_private *dev_priv = dev->dev_private;
> -       struct i915_address_space *vm = &dev_priv->gtt.base;
> +       struct i915_address_space *ggtt = &dev_priv->gtt.base;
>         struct drm_i915_gem_object *obj;
>         struct drm_mm_node *stolen;
>         struct i915_vma *vma;
> @@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>         if (gtt_offset == I915_GTT_OFFSET_NONE)
>                 return obj;
>
> -       vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> +       vma = i915_gem_vma_create(obj, ggtt);
>         if (IS_ERR(vma)) {
>                 ret = PTR_ERR(vma);
>                 goto err_out;
> @@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>          */
>         vma->node.start = gtt_offset;
>         vma->node.size = size;
> -       if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> -               ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> +       if (drm_mm_initialized(&ggtt->mm)) {
> +               ret = drm_mm_reserve_node(&ggtt->mm, &vma->node);
>                 if (ret) {
>                         DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
>                         i915_gem_vma_destroy(vma);
> @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>         obj->has_global_gtt_mapping = 1;
>
>         list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -       list_add_tail(&obj->mm_list, &vm->inactive_list);
> +       list_add_tail(&obj->mm_list, &ggtt->inactive_list);
>
>         return obj;
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 92a8d27..808ca2a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>
>                 obj->map_and_fenceable =
>                         !i915_gem_obj_ggtt_bound(obj) ||
> -                       (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> +                       (i915_gem_obj_ggtt_offset(obj) +
> +                        obj->base.size <= dev_priv->gtt.mappable_end &&
>                          i915_gem_object_fence_ok(obj, args->tiling_mode));
>
>                 /* Rebind if we need a change of alignment */
>                 if (!obj->map_and_fenceable) {
> -                       u32 unfenced_alignment =
> +                       struct i915_address_space *ggtt = &dev_priv->gtt.base;
> +                       u32 unfenced_align =
>                                 i915_gem_get_gtt_alignment(dev, obj->base.size,
>                                                             args->tiling_mode,
>                                                             false);
> -                       if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> -                               ret = i915_gem_object_unbind(obj);
> +                       if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> +                               ret = i915_gem_object_unbind(obj, ggtt);
>                 }
>
>                 if (ret == 0) {
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 7d283b5..3f019d3 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
>  );
>
>  TRACE_EVENT(i915_gem_object_bind,
> -           TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> -           TP_ARGS(obj, mappable),
> +           TP_PROTO(struct drm_i915_gem_object *obj,
> +                    struct i915_address_space *vm, bool mappable),
> +           TP_ARGS(obj, vm, mappable),
>
>             TP_STRUCT__entry(
>                              __field(struct drm_i915_gem_object *, obj)
> +                            __field(struct i915_address_space *, vm)
>                              __field(u32, offset)
>                              __field(u32, size)
>                              __field(bool, mappable)
> @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
>
>             TP_fast_assign(
>                            __entry->obj = obj;
> -                          __entry->offset = i915_gem_obj_ggtt_offset(obj);
> -                          __entry->size = i915_gem_obj_ggtt_size(obj);
> +                          __entry->offset = i915_gem_obj_offset(obj, vm);
> +                          __entry->size = i915_gem_obj_size(obj, vm);
>                            __entry->mappable = mappable;
>                            ),
>
> @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
>  );
>
>  TRACE_EVENT(i915_gem_object_unbind,
> -           TP_PROTO(struct drm_i915_gem_object *obj),
> -           TP_ARGS(obj),
> +           TP_PROTO(struct drm_i915_gem_object *obj,
> +                    struct i915_address_space *vm),
> +           TP_ARGS(obj, vm),
>
>             TP_STRUCT__entry(
>                              __field(struct drm_i915_gem_object *, obj)
> +                            __field(struct i915_address_space *, vm)
>                              __field(u32, offset)
>                              __field(u32, size)
>                              ),
>
>             TP_fast_assign(
>                            __entry->obj = obj;
> -                          __entry->offset = i915_gem_obj_ggtt_offset(obj);
> -                          __entry->size = i915_gem_obj_ggtt_size(obj);
> +                          __entry->offset = i915_gem_obj_offset(obj, vm);
> +                          __entry->size = i915_gem_obj_size(obj, vm);
>                            ),
>
>             TP_printk("obj=%p, offset=%08x size=%x",
> diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> index f3c97e0..b69cc63 100644
> --- a/drivers/gpu/drm/i915/intel_fb.c
> +++ b/drivers/gpu/drm/i915/intel_fb.c
> @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
>                       fb->width, fb->height,
>                       i915_gem_obj_ggtt_offset(obj), obj);
>
> -
>         mutex_unlock(&dev->struct_mutex);
>         vga_switcheroo_client_fb_set(dev->pdev, info);
>         return 0;
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 2abb53e..22ccb7e 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
>                 }
>                 overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
>         } else {
> -               ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
> +               ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
>                 if (ret) {
>                         DRM_ERROR("failed to pin overlay register bo\n");
>                         goto out_free_bo;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 008e0e0..0fb081c 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev)
>                 return NULL;
>         }
>
> -       ret = i915_gem_object_pin(ctx, 4096, true, false);
> +       ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
>         if (ret) {
>                 DRM_ERROR("failed to pin power context: %d\n", ret);
>                 goto err_unref;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8527ea0..88130a3 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -481,6 +481,7 @@ out:
>  static int
>  init_pipe_control(struct intel_ring_buffer *ring)
>  {
> +       struct drm_i915_private *dev_priv = ring->dev->dev_private;
>         struct pipe_control *pc;
>         struct drm_i915_gem_object *obj;
>         int ret;
> @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
>                 goto err;
>         }
>
> -       i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> +       i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +                                       I915_CACHE_LLC);
>
> -       ret = i915_gem_object_pin(obj, 4096, true, false);
> +       ret = i915_gem_ggtt_pin(obj, 4096, true, false);
>         if (ret)
>                 goto err_unref;
>
> @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
>  static int init_status_page(struct intel_ring_buffer *ring)
>  {
>         struct drm_device *dev = ring->dev;
> +       struct drm_i915_private *dev_priv = dev->dev_private;
>         struct drm_i915_gem_object *obj;
>         int ret;
>
> @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
>                 goto err;
>         }
>
> -       i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> +       i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> +                                       I915_CACHE_LLC);
>
> -       ret = i915_gem_object_pin(obj, 4096, true, false);
> +       ret = i915_gem_ggtt_pin(obj, 4096, true, false);
>         if (ret != 0) {
>                 goto err_unref;
>         }
> @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>
>         ring->obj = obj;
>
> -       ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
> +       ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
>         if (ret)
>                 goto err_unref;
>
> @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>                         return -ENOMEM;
>                 }
>
> -               ret = i915_gem_object_pin(obj, 0, true, false);
> +               ret = i915_gem_ggtt_pin(obj, 0, true, false);
>                 if (ret != 0) {
>                         drm_gem_object_unreference(&obj->base);
>                         DRM_ERROR("Failed to ping batch bo\n");
> --
> 1.8.3.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26  9:51   ` Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations) Daniel Vetter
@ 2013-07-26 16:59     ` Jesse Barnes
  2013-07-26 17:08       ` Chris Wilson
  2013-07-26 17:40       ` Daniel Vetter
  2013-07-26 20:15     ` Ben Widawsky
  1 sibling, 2 replies; 48+ messages in thread
From: Jesse Barnes @ 2013-07-26 16:59 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Ben Widawsky, Intel GFX

On Fri, 26 Jul 2013 11:51:00 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> HI all,
> 
> So Ben&I had a bit a private discussion and one thing I've explained a bit
> more in detail is what kind of review I'm doing as maintainer. I've
> figured this is generally useful. We've also discussed a bit that for
> developers without their own lab it would be nice if QA could test random
> branches on their set of machines. But imo that'll take quite a while,
> there's lots of other stuff to improve in QA land first. Anyway, here's
> it:
> 
> Now an explanation for why this freaked me out, which is essentially an
> explanation of what I do when I do maintainer reviews:
> 
> Probably the most important question I ask myself when reading a patch is
> "if a regression would bisect to this, and the bisect is the only useful
> piece of evidence, would I stand a chance to understand it?".  Your patch
> is big, has the appearance of doing a few unrelated things and could very
> well hide a bug which would take me an awful lot of time to spot. So imo
> the answer for your patch is a clear "no".

This is definitely a good point.  Big patches are both hard to review
and hard to debug, so should be kept as simple as possible (but no
simpler!).

> I've merged a few such patches in the past  where I've had a similar hunch
> and regretted it almost always. I've also sometimes split-up the patch
> while applying, but that approach doesn't scale any more with our rather
> big team.
> 
> The second thing I try to figure out is whether the patch author is indeed
> the local expert on the topic at hand now. With our team size and patch
> flow I don't stand a chance if I try to understand everything to the last
> detail. Instead I try to assess this through the proxy of convincing
> myself the the patch submitter understands stuff much better than I do. I
> tend to check that by asking random questions, proposing alternative
> approaches and also by rating code/patch clarity. The obj_set_color
> double-loop very much gave me the impression that you didn't have a clear
> idea about how exactly this should work, so that  hunk trigger this
> maintainer hunch.

This is the part I think is unfair (see below) when proposed
alternatives aren't clearly defined.

> I admit that this is all rather fluffy and very much an inexact science,
> but it's the only tools I have as a maintainer. The alternative of doing
> shit myself or checking everything myself in-depth just doesnt scale.

I'm glad you brought this up, but I see a contradiction here:  if
you're just asking random questions to convince yourself the author
knows what they're doing, but simultaneously you're not checking
everything yourself in-depth, you'll have no way to know whether your
questions are being dealt with properly.

I think the way out of that contradiction is to trust reviewers,
especially in specific areas.

There's a downside in that the design will be a little less coherent
(i.e. matching the vision of a single person), but as you said, that
doesn't scale.

So I'd suggest a couple of rules to help:
  1) every patch gets at least two reviewed-bys
  2) one of those reviewed-bys should be from a domain expert, e.g.:
     DP - Todd, Jani
     GEM - Chris, Daniel
     $PLATFORM - $PLATFORM owner
     HDMI - Paulo
     PSR/FBC - Rodrigo/Shobhit
     * - Daniel (you get to be a wildcard)
     etc.
  3) reviews aren't allowed to contain solely bikeshed/codingstyle
     change requests, if there's nothing substantial merge shouldn't be
     blocked (modulo egregious violations like Hungarian notation)
  4) review comments should be concrete and actionable, and ideally not
     leave the author hanging with hints about problems the reviewer
     has spotted, leaving the author looking for easter eggs

For the most part I think we adhere to this, though reviews from the
domain experts are done more on an ad-hoc basis these days...

Thoughts?

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26 16:59     ` Jesse Barnes
@ 2013-07-26 17:08       ` Chris Wilson
  2013-07-26 17:12         ` Jesse Barnes
  2013-07-26 17:40       ` Daniel Vetter
  1 sibling, 1 reply; 48+ messages in thread
From: Chris Wilson @ 2013-07-26 17:08 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Ben Widawsky, Intel GFX

On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote:
>   4) review comments should be concrete and actionable, and ideally not
>      leave the author hanging with hints about problems the reviewer
>      has spotted, leaving the author looking for easter eggs

Where am I going to find my fun, if I am not allowed to tell you that
you missed a zero in a thousand line patch but not tell you where?
Spoilsport :-p
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26 17:08       ` Chris Wilson
@ 2013-07-26 17:12         ` Jesse Barnes
  2013-08-04 20:31           ` Daniel Vetter
  0 siblings, 1 reply; 48+ messages in thread
From: Jesse Barnes @ 2013-07-26 17:12 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Ben Widawsky, Intel GFX

On Fri, 26 Jul 2013 18:08:48 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:

> On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote:
> >   4) review comments should be concrete and actionable, and ideally not
> >      leave the author hanging with hints about problems the reviewer
> >      has spotted, leaving the author looking for easter eggs
> 
> Where am I going to find my fun, if I am not allowed to tell you that
> you missed a zero in a thousand line patch but not tell you where?
> Spoilsport :-p

You'll just need to take up golf or something. :)

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26 16:59     ` Jesse Barnes
  2013-07-26 17:08       ` Chris Wilson
@ 2013-07-26 17:40       ` Daniel Vetter
  1 sibling, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-07-26 17:40 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Ben Widawsky, Intel GFX

On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote:
> On Fri, 26 Jul 2013 11:51:00 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > HI all,
> > 
> > So Ben&I had a bit a private discussion and one thing I've explained a bit
> > more in detail is what kind of review I'm doing as maintainer. I've
> > figured this is generally useful. We've also discussed a bit that for
> > developers without their own lab it would be nice if QA could test random
> > branches on their set of machines. But imo that'll take quite a while,
> > there's lots of other stuff to improve in QA land first. Anyway, here's
> > it:
> > 
> > Now an explanation for why this freaked me out, which is essentially an
> > explanation of what I do when I do maintainer reviews:
> > 
> > Probably the most important question I ask myself when reading a patch is
> > "if a regression would bisect to this, and the bisect is the only useful
> > piece of evidence, would I stand a chance to understand it?".  Your patch
> > is big, has the appearance of doing a few unrelated things and could very
> > well hide a bug which would take me an awful lot of time to spot. So imo
> > the answer for your patch is a clear "no".
> 
> This is definitely a good point.  Big patches are both hard to review
> and hard to debug, so should be kept as simple as possible (but no
> simpler!).
> 
> > I've merged a few such patches in the past  where I've had a similar hunch
> > and regretted it almost always. I've also sometimes split-up the patch
> > while applying, but that approach doesn't scale any more with our rather
> > big team.
> > 
> > The second thing I try to figure out is whether the patch author is indeed
> > the local expert on the topic at hand now. With our team size and patch
> > flow I don't stand a chance if I try to understand everything to the last
> > detail. Instead I try to assess this through the proxy of convincing
> > myself the the patch submitter understands stuff much better than I do. I
> > tend to check that by asking random questions, proposing alternative
> > approaches and also by rating code/patch clarity. The obj_set_color
> > double-loop very much gave me the impression that you didn't have a clear
> > idea about how exactly this should work, so that  hunk trigger this
> > maintainer hunch.
> 
> This is the part I think is unfair (see below) when proposed
> alternatives aren't clearly defined.

Ben split up the patches meanwhile and imo they now look great (so fully
address the first concern). I've read through them this morning and dumped
a few (imo actionable) quick comments on irc. For the example here my
request is to squash a double-loop over vma lists (which will also rip out
a function call indirection as a bonus).

> > I admit that this is all rather fluffy and very much an inexact science,
> > but it's the only tools I have as a maintainer. The alternative of doing
> > shit myself or checking everything myself in-depth just doesnt scale.
> 
> I'm glad you brought this up, but I see a contradiction here:  if
> you're just asking random questions to convince yourself the author
> knows what they're doing, but simultaneously you're not checking
> everything yourself in-depth, you'll have no way to know whether your
> questions are being dealt with properly.

Well if the reply is unsure or inconstistent then I tend to dig in. E.g.
with Paulo's pc8+ stuff I've asked a few questions about interactions with
gmbus/edid reading/gem execbuf and he replied that he doesn't know. His
2nd patch version was still a bit thin on details in that area, so I've
sat down read through stuff and made a concrete&actionable list of
corner-cases I think we should exercise.

> I think the way out of that contradiction is to trust reviewers,
> especially in specific areas.
 
Imo I've already started with that, there's lots of patches where I only
do a very cursory read when merging since I trust $AUTHOR and $REVIEWER
to get it right.

> There's a downside in that the design will be a little less coherent
> (i.e. matching the vision of a single person), but as you said, that
> doesn't scale.

I think overall we can still achieve good consistency in the design, so
that's a part where I try to chip in. But with a larger team it's clear
that consistency in little details will fizzle out more, otoh doing such
cleanups after big reworks (heck I've been rather inconstinent in all the
refactoring in the modeset code myself) sounds like good material to drag
newbies into our codebase.
 
> So I'd suggest a couple of rules to help:
>   1) every patch gets at least two reviewed-bys

We have a hard time doing our current review load in a timely manner
already, I don't expect this to scale if we do it formally. But ...

>   2) one of those reviewed-bys should be from a domain expert, e.g.:
>      DP - Todd, Jani
>      GEM - Chris, Daniel
>      $PLATFORM - $PLATFORM owner
>      HDMI - Paulo
>      PSR/FBC - Rodrigo/Shobhit
>      * - Daniel (you get to be a wildcard)
>      etc.

... this is something that I've started to take into account already. E.g.
when I ask someone less experienced for a given topic to do a
fish-out-of-water review I'll also poke domain experts to ack it. And if
there's a concern it obviously overrules an r-b tag from someone else.

>   3) reviews aren't allowed to contain solely bikeshed/codingstyle
>      change requests, if there's nothing substantial merge shouldn't be
>      blocked (modulo egregious violations like Hungarian notation)

I think we're doing fairly well. Occasionally I rant around review myself,
but often that's just the schlep of digging the patch out again and
refining it - most often the reviewer is right, which obviously makes it
worse ;-)

We have a few cases where discussions tend to loop forever. Sometimes I
step in but often I feel like I shouldn't be the one to make the call,
e.g. the audio discussions around the hsw power well drag out often, but
imo that's a topic where Paulo should make the calls.

Occasionally though I block a patch on bikeshed topics simply because I
think the improved consistency is worth it. One example is the gen checks
so that our code matches 0-based C array semantics and our usual writing
style of using genX+ and pre-genX to be inclusive/exclusive respectively.

>   4) review comments should be concrete and actionable, and ideally not
>      leave the author hanging with hints about problems the reviewer
>      has spotted, leaving the author looking for easter eggs

Where's the fun in that? I think the right way to look at easter egg
hunting is that the clear&actionable task from the reviewer is to go
easter egg hunting ;-)

More seriously though asking "what happens if?" questions is an important
part of review imo, and sometimes those tend to be an easter egg hunt for
both reviewer and patch author."

> For the most part I think we adhere to this, though reviews from the
> domain experts are done more on an ad-hoc basis these days...
> 
> Thoughts?

Generally I think our overall process is
a) a mess (as in not really formalized much) and
b) works surprisingly well.

So I think fine-tuning of individual parts and having an occasional
process discussion should be good enough to keep going.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26  9:51   ` Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations) Daniel Vetter
  2013-07-26 16:59     ` Jesse Barnes
@ 2013-07-26 20:15     ` Ben Widawsky
  2013-07-26 20:43       ` Daniel Vetter
  1 sibling, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-26 20:15 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Fri, Jul 26, 2013 at 11:51:00AM +0200, Daniel Vetter wrote:
> HI all,
> 
> So Ben&I had a bit a private discussion and one thing I've explained a bit
> more in detail is what kind of review I'm doing as maintainer. I've
> figured this is generally useful. We've also discussed a bit that for
> developers without their own lab it would be nice if QA could test random
> branches on their set of machines. But imo that'll take quite a while,
> there's lots of other stuff to improve in QA land first. Anyway, here's
> it:
> 
> Now an explanation for why this freaked me out, which is essentially an
> explanation of what I do when I do maintainer reviews:
> 
> Probably the most important question I ask myself when reading a patch is
> "if a regression would bisect to this, and the bisect is the only useful
> piece of evidence, would I stand a chance to understand it?".  Your patch
> is big, has the appearance of doing a few unrelated things and could very
> well hide a bug which would take me an awful lot of time to spot. So imo
> the answer for your patch is a clear "no".
> 
> I've merged a few such patches in the past  where I've had a similar hunch
> and regretted it almost always. I've also sometimes split-up the patch
> while applying, but that approach doesn't scale any more with our rather
> big team.

You should never do this, IMO. If you require the patches to be split in
your tree, the developer should do it. See below for reasons I think
this sucks.

> 
> The second thing I try to figure out is whether the patch author is indeed
> the local expert on the topic at hand now. With our team size and patch
> flow I don't stand a chance if I try to understand everything to the last
> detail. Instead I try to assess this through the proxy of convincing
> myself the the patch submitter understands stuff much better than I do. I
> tend to check that by asking random questions, proposing alternative
> approaches and also by rating code/patch clarity. The obj_set_color
> double-loop very much gave me the impression that you didn't have a clear
> idea about how exactly this should work, so that  hunk trigger this
> maintainer hunch.
> 
> I admit that this is all rather fluffy and very much an inexact science,
> but it's the only tools I have as a maintainer. The alternative of doing
> shit myself or checking everything myself in-depth just doesnt scale.
> 
> Cheers, Daniel
> 
> 
> On Mon, Jul 22, 2013 at 4:08 AM, Ben Widawsky <ben@bwidawsk.net> wrote:

I think the subthread Jesse started had a bunch of good points, but
concisely I see 3 problems with our current process (and these were
addressed in my original mail, but I guess you didn't want to air my
dirty laundry :p):

1. Delay hurts QA. Balking on patches because they're hard to review
limits QA on that patch, and reduces QA time on the fixed up patches. I
agree this is something which is fixable within QA, but it doesn't exist
at present.

2. We don't have a way to bound review/merge. I tried to do this on this
series. After your initial review, I gave a list of things I was going
to fix, and asked you for an ack that if I fixed those, you would merge.
IMO, you didn't stick to this agreement, and came back with rework
requests on a patch I had already submitted. I don't know how to fix
this one because I think you should be entitled to change your mind.

A caveat to this: I did make some mistakes on rebase that needed
addressing. ie. the ends justified the means.

3a. Reworking code introduces bugs. I feel I am more guilty here than
most, but, consider even in the best case of those new bugs being
caught in review. In such a case, you've now introduced at least 2 extra
revs, and 2 extra lag cycles waiting for review. That assumes further
work doesn't spiral into more requested fixups, or more bugs. In the
less ideal case, you've simply introduced a new bug in addition to the
delay.

3b. Patch splitting is art not science.

There is a really delicate balance between splitting patches because
it's logically a functional split vs. splitting things up to make things
easier to chew on. Now in my case specifically, I think overall the
series has improved, and I found some crud that got squashed in which
shouldn't have been there. I also believe a lot of the splitting really
doesn't make much sense other than for review purposes and sometimes
that is okay.

In my case, I had a huge patch, but a lot of that patch was actually a
sed job of "s/obj/obj,vm/."  You came back with, "you're doing a bunch
of extra lookups." That was exactly the point of the patch; the extra
lookups should have made the review simpler, and could be cleaned up
later.

My point is: A larger quantity of small patches is not always easier to
review than a small quantity of large patches. Large patch series review
often requires the reviewer to keep a lot of context as they review.

*4. The result of all this is I think a lot of the time we (the
developers) end up writing your patch for you. While I respect your
opinion very highly, and I think more often than not that your way is
better, it's just inefficient.


I'll wrap this all up with, I don't envy you. On a bunch of emails, I've
seen you be apologetic for putting developers in between a rock, and a
hard place (you, and program management). I recognize you have the same
dilemma with Dave/Linus, and the rest of us developers. I think the
overall strategy should be to improve QA, but then you have to take the
leap of limiting your requests for reworks, and accepting QAs stamp of
approval.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26 20:15     ` Ben Widawsky
@ 2013-07-26 20:43       ` Daniel Vetter
  2013-07-26 23:13         ` Dave Airlie
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Vetter @ 2013-07-26 20:43 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Fri, Jul 26, 2013 at 10:15 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> I think the subthread Jesse started had a bunch of good points, but
> concisely I see 3 problems with our current process (and these were
> addressed in my original mail, but I guess you didn't want to air my
> dirty laundry :p):

I've cut out some of the later discussion in my mail (and that thread)
since I've figured it's not the main point I wanted to make. No fear
of dirty laundry ;-)

>
> 1. Delay hurts QA. Balking on patches because they're hard to review
> limits QA on that patch, and reduces QA time on the fixed up patches. I
> agree this is something which is fixable within QA, but it doesn't exist
> at present.

Yeah, I agree that this is an issue for developers without their
private lab ;-) And it's also an issue for those with one, since
running tests without a good fully automated system is a pian.

With discussed this a bit with Jesse yesterday on irc, but my point is
that currentl QA doesn't have a quick enough turn-around even for
testing -nightly that this would be feasible. And I also think that
something like this should be started with userspace (i.e. mesa)
testing first, which is already in progress.

Once QA has infrastructure to test arbitrary branches and once they
have enough horsepower and automation (and people to do all this) we
can take a look again. But imo trying to do this early is just wishful
thinking, we have to deal with what we have, not what we'd like to get
for Xmas.

> 2. We don't have a way to bound review/merge. I tried to do this on this
> series. After your initial review, I gave a list of things I was going
> to fix, and asked you for an ack that if I fixed those, you would merge.
> IMO, you didn't stick to this agreement, and came back with rework
> requests on a patch I had already submitted. I don't know how to fix
> this one because I think you should be entitled to change your mind.
>
> A caveat to this: I did make some mistakes on rebase that needed
> addressing. ie. the ends justified the means.

Yeah, the problem is that for really big stuff like your ppgtt series
the merge process is incremental: We'll do a rough plan and then pull
in parts one-by-one. And then when the sub-series get reviewed new
things pop up. And sometimes the reviewer is simply confused and asks
for stupid things ...

I don't think we can fix this since that's just how it works. But we
can certainly keep this in mind when estimating the effort to get
features in - big stuff will have some uncertainty (and hence need for
time buffers) even after the first review. For the ppgtt work I need
to blame myself too since the original plan was way too optimistic,
but I really wanted to get this in before you get sucked away into the
next big thing lined up (which in this case unfortunately came
attached with a deadline).

> 3a. Reworking code introduces bugs. I feel I am more guilty here than
> most, but, consider even in the best case of those new bugs being
> caught in review. In such a case, you've now introduced at least 2 extra
> revs, and 2 extra lag cycles waiting for review. That assumes further
> work doesn't spiral into more requested fixups, or more bugs. In the
> less ideal case, you've simply introduced a new bug in addition to the
> delay.

I'm trying to address this by sharing rebase BKMs as much as possible.
Since I'm the one on the team doing the most rebasing (with -internal)
that hopefully helps.

> 3b. Patch splitting is art not science.
>
> There is a really delicate balance between splitting patches because
> it's logically a functional split vs. splitting things up to make things
> easier to chew on. Now in my case specifically, I think overall the
> series has improved, and I found some crud that got squashed in which
> shouldn't have been there. I also believe a lot of the splitting really
> doesn't make much sense other than for review purposes and sometimes
> that is okay.

Imo splitting patches has two functions: Make the reviewer's life
easier (not really the developers) and have simple patches in case of
a regression which bisects to it. Ime you get about a 1-in-5
regression rate in dinq, so that chance is very much neglectable. And
for the ugly regressions where we have no clue we can easily blow
through a few man-months of engineer time to track them time.

> In my case, I had a huge patch, but a lot of that patch was actually a
> sed job of "s/obj/obj,vm/."  You came back with, "you're doing a bunch
> of extra lookups." That was exactly the point of the patch; the extra
> lookups should have made the review simpler, and could be cleaned up
> later.
>
> My point is: A larger quantity of small patches is not always easier to
> review than a small quantity of large patches. Large patch series review
> often requires the reviewer to keep a lot of context as they review.

I don't mind big sed jobs or moving functions to new files (well those
quite a bit since they're a pain for rebasing -internal). But such a
big patch needs to be conceptually really simple, my rule of thumb is
that patch size times complexity should follow a constant upper limit.
So a big move stuff patch shouldn't also rename a bunch of functions
(wasn't too happy about Chris' intel_uncore.c extract) since that
makes comparing harder (both in review and in rebasing).

If the patch is really big (like driver-wide sed jobs) the conceptual
change should approach 0. For example if you want to embed an object
you first create an access helper (big sed job, no change, not even in
the struct layout). Then a 2nd patch which changes the access helper,
but would otherwise be very small.

Imo  the big patch I've asked you to split up had lot of sed-like
things, but also a few potentially functional/conceptual changes in
it. The combination was imo too much. But that doesn't mean I won't
accept sed jobs that result in a much larger diff, just that they need
to be really simple.

> *4. The result of all this is I think a lot of the time we (the
> developers) end up writing your patch for you. While I respect your
> opinion very highly, and I think more often than not that your way is
> better, it's just inefficient.

Yeah, I'm aware that sometimes I go overboard with "my way or the
highway" even if I don't state that explicitly. Often though when I
drop random ideas or ask questions I'm ok if the patch author sticks
to his way if it comes with a good explanation attached. That at least
is one of the reason why I want to always update commit messages even
when the reviewer in the end did not ask for a code change.

Todays discussion about the loop in one of your patches in
evict_everything was a prime example: I've read through your code,
decided that it looks funny and dropped a suggestion on irc. But later
on I've read the end result and noticed that my suggestion is much
worse than what you have.

In such cases I expect developers to stand up, explain why something
is like it is and tell me that I'm full of myself ;-)

This will be even more important going forward since with the growing
team and code output I'll be less and less able to keep track of
everything. So the chance that I'll utter complete bs in a review will
only increase. If you don't call me out on it we'll end up with worse
code, which I very much don't want to.

> I'll wrap this all up with, I don't envy you. On a bunch of emails, I've
> seen you be apologetic for putting developers in between a rock, and a
> hard place (you, and program management). I recognize you have the same
> dilemma with Dave/Linus, and the rest of us developers. I think the
> overall strategy should be to improve QA, but then you have to take the
> leap of limiting your requests for reworks, and accepting QAs stamp of
> approval.

Hey, overall it's actually quite a bit of fun.

I do agree that QA is really important for a fastpaced process, but
it's also not the only peace to get something in. Review (both of the
patch itself but also of  the test coverage) catches a lot of issues,
and in many cases not the same ones as QA would. Especially if the
testcoverage of a new feature is less than stellar, which imo is still
the case for gem due to the tons of finickle cornercases.

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/12] drm/i915: Use the new vm [un]bind functions
  2013-07-23 16:54   ` Daniel Vetter
@ 2013-07-26 21:48     ` Ben Widawsky
  2013-07-26 21:56       ` Daniel Vetter
  0 siblings, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-26 21:48 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 23, 2013 at 06:54:43PM +0200, Daniel Vetter wrote:
> On Sun, Jul 21, 2013 at 07:08:13PM -0700, Ben Widawsky wrote:
> > Building on the last patch which created the new function pointers in
> > the VM for bind/unbind, here we actually put those new function pointers
> > to use.
> > 
> > Split out as a separate patch to aid in review. I'm fine with squashing
> > into the previous patch if people request it.
> > 
> > v2: Updated to address the smart ggtt which can do aliasing as needed
> > Make sure we bind to global gtt when mappable and fenceable. I thought
> > we could get away without this initialy, but we cannot.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Meta review on the patch split: If you create new functions in a prep
> patch, then switch and then kill the old functions it's much harder to
> review whether any unwanted functional changes have been introduced.
> Reviewers have to essentially keep both the old and new code open and
> compare by hand.  And generally the really hard regression in gem have
> been due to such deeply-hidden accidental changes, and we frankly don't
> yet have the test coverage to just gloss over this.
> 
> If you instead first prepare the existing functions by changing the
> arguments and logic, and then once everything is in place switch over to
> vfuncs in the 2nd patch changes will be in-place. In-place changes are
> much easier to review since diff compresses away unchanged parts.
> 
> Second reason for this approach is that the functions stay at the same
> place in the source code file, which reduces the amount of spurious
> conflicts when rebasing a large set of patches around such changes ...
> 
> I need to ponder this more.
> -Daniel

ping

> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h            | 10 ------
> >  drivers/gpu/drm/i915/i915_gem.c            | 37 +++++++++------------
> >  drivers/gpu/drm/i915/i915_gem_context.c    |  7 ++--
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 ++++++++--------
> >  drivers/gpu/drm/i915/i915_gem_gtt.c        | 53 ++----------------------------
> >  5 files changed, 37 insertions(+), 99 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index f3f2825..8d6aa34 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1933,18 +1933,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >  
> >  /* i915_gem_gtt.c */
> >  void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
> > -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > -			    struct drm_i915_gem_object *obj,
> > -			    enum i915_cache_level cache_level);
> > -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > -			      struct drm_i915_gem_object *obj);
> > -
> >  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
> >  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> > -/* FIXME: this is never okay with full PPGTT */
> > -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> > -				enum i915_cache_level cache_level);
> > -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> >  void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
> >  void i915_gem_init_global_gtt(struct drm_device *dev);
> >  void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 9ea6424..63297d7 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2653,12 +2653,9 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> >  
> >  	trace_i915_gem_object_unbind(obj, vm);
> >  
> > -	if (obj->has_global_gtt_mapping && i915_is_ggtt(vm))
> > -		i915_gem_gtt_unbind_object(obj);
> > -	if (obj->has_aliasing_ppgtt_mapping) {
> > -		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> > -		obj->has_aliasing_ppgtt_mapping = 0;
> > -	}
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> > +	vm->unmap_vma(vma);
> > +
> >  	i915_gem_gtt_finish_object(obj);
> >  	i915_gem_object_unpin_pages(obj);
> >  
> > @@ -2666,7 +2663,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> >  	if (i915_is_ggtt(vm))
> >  		obj->map_and_fenceable = true;
> >  
> > -	vma = i915_gem_obj_to_vma(obj, vm);
> >  	list_del(&vma->mm_list);
> >  	list_del(&vma->vma_link);
> >  	drm_mm_remove_node(&vma->node);
> > @@ -3372,7 +3368,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  				    enum i915_cache_level cache_level)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> > -	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> >  	int ret;
> >  
> > @@ -3407,13 +3402,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  				return ret;
> >  		}
> >  
> > -		if (obj->has_global_gtt_mapping)
> > -			i915_gem_gtt_bind_object(obj, cache_level);
> > -		if (obj->has_aliasing_ppgtt_mapping)
> > -			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > -					       obj, cache_level);
> > -
> > -		i915_gem_obj_set_color(obj, vma->vm, cache_level);
> > +		vm->map_vma(vma, cache_level, 0);
> > +		i915_gem_obj_set_color(obj, vm, cache_level);
> >  	}
> >  
> >  	if (cache_level == I915_CACHE_NONE) {
> > @@ -3695,6 +3685,8 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  		    bool map_and_fenceable,
> >  		    bool nonblocking)
> >  {
> > +	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
> > +	struct i915_vma *vma;
> >  	int ret;
> >  
> >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> > @@ -3702,6 +3694,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  
> >  	WARN_ON(map_and_fenceable && !i915_is_ggtt(vm));
> >  
> > +	/* FIXME: Use vma for bounds check */
> >  	if (i915_gem_obj_bound(obj, vm)) {
> >  		if ((alignment &&
> >  		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
> > @@ -3720,20 +3713,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  	}
> >  
> >  	if (!i915_gem_obj_bound(obj, vm)) {
> > -		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > -
> >  		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
> >  						 map_and_fenceable,
> >  						 nonblocking);
> >  		if (ret)
> >  			return ret;
> >  
> > -		if (!dev_priv->mm.aliasing_ppgtt)
> > -			i915_gem_gtt_bind_object(obj, obj->cache_level);
> > -	}
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> > +		vm->map_vma(vma, obj->cache_level, flags);
> > +	} else
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> >  
> > +	/* Objects are created map and fenceable. If we bind an object
> > +	 * the first time, and we had aliasing PPGTT (and didn't request
> > +	 * GLOBAL), we'll need to do this on the second bind.*/
> >  	if (!obj->has_global_gtt_mapping && map_and_fenceable)
> > -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > +		vm->map_vma(vma, obj->cache_level, GLOBAL_BIND);
> >  
> >  	obj->pin_count++;
> >  	obj->pin_mappable |= map_and_fenceable;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 873577d..cc7c0b4 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -417,8 +417,11 @@ static int do_switch(struct i915_hw_context *to)
> >  		return ret;
> >  	}
> >  
> > -	if (!to->obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
> > +	if (!to->obj->has_global_gtt_mapping) {
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(to->obj,
> > +							   &dev_priv->gtt.base);
> > +		vma->vm->map_vma(vma, to->obj->cache_level, GLOBAL_BIND);
> > +	}
> >  
> >  	if (!to->is_initialized || is_default_context(to))
> >  		hw_flags |= MI_RESTORE_INHIBIT;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 8d2643b..6359ef2 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -197,8 +197,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  	if (unlikely(IS_GEN6(dev) &&
> >  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
> >  	    !target_i915_obj->has_global_gtt_mapping)) {
> > -		i915_gem_gtt_bind_object(target_i915_obj,
> > -					 target_i915_obj->cache_level);
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
> > +		vma->vm->map_vma(vma, target_i915_obj->cache_level,
> > +				 GLOBAL_BIND);
> >  	}
> >  
> >  	/* Validate that the target is in a valid r/w GPU domain */
> > @@ -404,10 +405,12 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  				   struct i915_address_space *vm,
> >  				   bool *need_reloc)
> >  {
> > -	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> >  	struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> >  	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
> >  	bool need_fence, need_mappable;
> > +	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
> > +		!obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
> > +	struct i915_vma *vma;
> >  	int ret;
> >  
> >  	need_fence =
> > @@ -421,6 +424,7 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  	if (ret)
> >  		return ret;
> >  
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> >  	entry->flags |= __EXEC_OBJECT_HAS_PIN;
> >  
> >  	if (has_fenced_gpu_access) {
> > @@ -436,14 +440,6 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  		}
> >  	}
> >  
> > -	/* Ensure ppgtt mapping exists if needed */
> > -	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> > -		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > -				       obj, obj->cache_level);
> > -
> > -		obj->has_aliasing_ppgtt_mapping = 1;
> > -	}
> > -
> >  	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> >  		entry->offset = i915_gem_obj_offset(obj, vm);
> >  		*need_reloc = true;
> > @@ -454,9 +450,7 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
> >  	}
> >  
> > -	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
> > -	    !obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > +	vm->map_vma(vma, obj->cache_level, flags);
> >  
> >  	return 0;
> >  }
> > @@ -1047,8 +1041,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> >  	 * hsw should have this fixed, but let's be paranoid and do it
> >  	 * unconditionally for now. */
> > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > +	if (flags & I915_DISPATCH_SECURE &&
> > +	    !batch_obj->has_global_gtt_mapping) {
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(batch_obj, vm);
> > +		vm->map_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
> > +	}
> >  
> >  	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->objects);
> >  	if (ret)
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 03e6179..1de49a0 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -414,18 +414,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
> >  	dev_priv->mm.aliasing_ppgtt = NULL;
> >  }
> >  
> > -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > -			    struct drm_i915_gem_object *obj,
> > -			    enum i915_cache_level cache_level)
> > -{
> > -	struct i915_address_space *vm = &ppgtt->base;
> > -	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > -
> > -	vm->insert_entries(vm, obj->pages,
> > -			   obj_offset >> PAGE_SHIFT,
> > -			   cache_level);
> > -}
> > -
> >  static void __always_unused gen6_ppgtt_map_vma(struct i915_vma *vma,
> >  					       enum i915_cache_level cache_level,
> >  					       u32 flags)
> > @@ -437,16 +425,6 @@ static void __always_unused gen6_ppgtt_map_vma(struct i915_vma *vma,
> >  	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
> >  }
> >  
> > -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > -			      struct drm_i915_gem_object *obj)
> > -{
> > -	struct i915_address_space *vm = &ppgtt->base;
> > -	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
> > -
> > -	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
> > -			obj->base.size >> PAGE_SHIFT);
> > -}
> > -
> >  static void __always_unused gen6_ppgtt_unmap_vma(struct i915_vma *vma)
> >  {
> >  	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > @@ -507,8 +485,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> >  		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> >  
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> > +							   &dev_priv->gtt.base);
> >  		i915_gem_clflush_object(obj);
> > -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > +		vma->vm->map_vma(vma, obj->cache_level, 0);
> >  	}
> >  
> >  	i915_gem_chipset_flush(dev);
> > @@ -664,33 +644,6 @@ static void gen6_ggtt_map_vma(struct i915_vma *vma,
> >  	}
> >  }
> >  
> > -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> > -			      enum i915_cache_level cache_level)
> > -{
> > -	struct drm_device *dev = obj->base.dev;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> > -
> > -	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
> > -					  entry,
> > -					  cache_level);
> > -
> > -	obj->has_global_gtt_mapping = 1;
> > -}
> > -
> > -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> > -{
> > -	struct drm_device *dev = obj->base.dev;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> > -
> > -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > -				       entry,
> > -				       obj->base.size >> PAGE_SHIFT);
> > -
> > -	obj->has_global_gtt_mapping = 0;
> > -}
> > -
> >  static void gen6_ggtt_unmap_vma(struct i915_vma *vma)
> >  {
> >  	struct drm_device *dev = vma->vm->dev;
> > -- 
> > 1.8.3.3
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/12] drm/i915: Track active by VMA instead of object
  2013-07-23 16:48   ` Daniel Vetter
@ 2013-07-26 21:48     ` Ben Widawsky
  0 siblings, 0 replies; 48+ messages in thread
From: Ben Widawsky @ 2013-07-26 21:48 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 23, 2013 at 06:48:09PM +0200, Daniel Vetter wrote:
> On Sun, Jul 21, 2013 at 07:08:11PM -0700, Ben Widawsky wrote:
> > Even though we want to be able to track active by VMA, the rest of the
> > code is still using objects for most internal APIs. To solve this,
> > create an object_is_active() function to help us in converting over to
> > VMA usage.
> > 
> > Because we intend to keep around some functions that care about objects,
> > and not VMAs, having this function around will be useful even as we
> > begin to use VMAs more in function arguments.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Still not really convinced. For access synchronization we don't care
> through which vm a bo is still access, only how (read/write) and when was
> the last access (ring + seqno).
> 
> Note that this means that the per-vm lru doesn't really need an
> active/inactive split anymore, for evict_something we only care about the
> ordering and not whether a bo is active or not. unbind() will care but I'm
> not sure that the "same bo in multiple address spaces needs to be evicted"
> use-case is something we even should care about.
> 
> So imo this commit needs a good justificatio for _why_ we want to track
> active per-vma. Atm I don't see a use-case, but I see complexity.
> -Daniel

I'm fine with deferring this until needed.

> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h            | 15 +++----
> >  drivers/gpu/drm/i915/i915_gem.c            | 64 ++++++++++++++++++------------
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
> >  3 files changed, 48 insertions(+), 33 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index f809204..bdce9c1 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -541,6 +541,13 @@ struct i915_vma {
> >  	struct drm_i915_gem_object *obj;
> >  	struct i915_address_space *vm;
> >  
> > +	/**
> > +	 * This is set if the object is on the active lists (has pending
> > +	 * rendering and so a non-zero seqno), and is not set if it i s on
> > +	 * inactive (ready to be unbound) list.
> > +	 */
> > +	unsigned int active:1;
> > +
> >  	/** This object's place on the active/inactive lists */
> >  	struct list_head mm_list;
> >  
> > @@ -1266,13 +1273,6 @@ struct drm_i915_gem_object {
> >  	struct list_head exec_list;
> >  
> >  	/**
> > -	 * This is set if the object is on the active lists (has pending
> > -	 * rendering and so a non-zero seqno), and is not set if it i s on
> > -	 * inactive (ready to be unbound) list.
> > -	 */
> > -	unsigned int active:1;
> > -
> > -	/**
> >  	 * This is set if the object has been written to since last bound
> >  	 * to the GTT
> >  	 */
> > @@ -1726,6 +1726,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
> >  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> >  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> >  			 struct intel_ring_buffer *to);
> > +bool i915_gem_object_is_active(struct drm_i915_gem_object *obj);
> >  void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  				    struct i915_address_space *vm,
> >  				    struct intel_ring_buffer *ring);
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 6bdf89d..9ea6424 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -119,10 +119,22 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> >  	return 0;
> >  }
> >  
> > +/* NB: Not the same as !i915_gem_object_is_inactive */
> > +bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
> > +{
> > +	struct i915_vma *vma;
> > +
> > +	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > +		if (vma->active)
> > +			return true;
> > +
> > +	return false;
> > +}
> > +
> >  static inline bool
> >  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> >  {
> > -	return i915_gem_obj_bound_any(obj) && !obj->active;
> > +	return i915_gem_obj_bound_any(obj) && !i915_gem_object_is_active(obj);
> >  }
> >  
> >  int
> > @@ -1883,14 +1895,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  	}
> >  	obj->ring = ring;
> >  
> > +	/* Move from whatever list we were on to the tail of execution. */
> > +	vma = i915_gem_obj_to_vma(obj, vm);
> >  	/* Add a reference if we're newly entering the active list. */
> > -	if (!obj->active) {
> > +	if (!vma->active) {
> >  		drm_gem_object_reference(&obj->base);
> > -		obj->active = 1;
> > +		vma->active = 1;
> >  	}
> >  
> > -	/* Move from whatever list we were on to the tail of execution. */
> > -	vma = i915_gem_obj_to_vma(obj, vm);
> >  	list_move_tail(&vma->mm_list, &vm->active_list);
> >  	list_move_tail(&obj->ring_list, &ring->active_list);
> >  
> > @@ -1911,16 +1923,23 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  }
> >  
> >  static void
> > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> > -				 struct i915_address_space *vm)
> > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> >  {
> > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > +	struct i915_address_space *vm;
> >  	struct i915_vma *vma;
> > +	int i = 0;
> >  
> >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> > -	BUG_ON(!obj->active);
> >  
> > -	vma = i915_gem_obj_to_vma(obj, vm);
> > -	list_move_tail(&vma->mm_list, &vm->inactive_list);
> > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> > +		if (!vma || !vma->active)
> > +			continue;
> > +		list_move_tail(&vma->mm_list, &vm->inactive_list);
> > +		vma->active = 0;
> > +		i++;
> > +	}
> >  
> >  	list_del_init(&obj->ring_list);
> >  	obj->ring = NULL;
> > @@ -1932,8 +1951,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
> >  	obj->last_fenced_seqno = 0;
> >  	obj->fenced_gpu_access = false;
> >  
> > -	obj->active = 0;
> > -	drm_gem_object_unreference(&obj->base);
> > +	while (i--)
> > +		drm_gem_object_unreference(&obj->base);
> >  
> >  	WARN_ON(i915_verify_lists(dev));
> >  }
> > @@ -2254,15 +2273,13 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
> >  	}
> >  
> >  	while (!list_empty(&ring->active_list)) {
> > -		struct i915_address_space *vm;
> >  		struct drm_i915_gem_object *obj;
> >  
> >  		obj = list_first_entry(&ring->active_list,
> >  				       struct drm_i915_gem_object,
> >  				       ring_list);
> >  
> > -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > -			i915_gem_object_move_to_inactive(obj, vm);
> > +		i915_gem_object_move_to_inactive(obj);
> >  	}
> >  }
> >  
> > @@ -2348,8 +2365,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  	 * by the ringbuffer to the flushing/inactive lists as appropriate.
> >  	 */
> >  	while (!list_empty(&ring->active_list)) {
> > -		struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > -		struct i915_address_space *vm;
> >  		struct drm_i915_gem_object *obj;
> >  
> >  		obj = list_first_entry(&ring->active_list,
> > @@ -2359,8 +2374,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
> >  		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> >  			break;
> >  
> > -		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > -			i915_gem_object_move_to_inactive(obj, vm);
> > +		BUG_ON(!i915_gem_object_is_active(obj));
> > +		i915_gem_object_move_to_inactive(obj);
> >  	}
> >  
> >  	if (unlikely(ring->trace_irq_seqno &&
> > @@ -2435,7 +2450,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
> >  {
> >  	int ret;
> >  
> > -	if (obj->active) {
> > +	if (i915_gem_object_is_active(obj)) {
> >  		ret = i915_gem_check_olr(obj->ring, obj->last_read_seqno);
> >  		if (ret)
> >  			return ret;
> > @@ -2500,7 +2515,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> >  	if (ret)
> >  		goto out;
> >  
> > -	if (obj->active) {
> > +	if (i915_gem_object_is_active(obj)) {
> >  		seqno = obj->last_read_seqno;
> >  		ring = obj->ring;
> >  	}
> > @@ -3850,7 +3865,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
> >  	 */
> >  	ret = i915_gem_object_flush_active(obj);
> >  
> > -	args->busy = obj->active;
> > +	args->busy = i915_gem_object_is_active(obj);
> >  	if (obj->ring) {
> >  		BUILD_BUG_ON(I915_NUM_RINGS > 16);
> >  		args->busy |= intel_ring_flag(obj->ring) << 16;
> > @@ -4716,13 +4731,12 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >  			cnt += obj->base.size >> PAGE_SHIFT;
> >  
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > -		if (obj->active)
> > +		if (i915_gem_object_is_active(obj))
> >  			continue;
> >  
> >  		i915_gem_object_flush_gtt_write_domain(obj);
> >  		i915_gem_object_flush_cpu_write_domain(obj);
> > -		/* FIXME: Can't assume global gtt */
> > -		i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base);
> > +		i915_gem_object_move_to_inactive(obj);
> >  
> >  		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> >  			cnt += obj->base.size >> PAGE_SHIFT;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 819d8d8..8d2643b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -251,7 +251,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  	}
> >  
> >  	/* We can't wait for rendering with pagefaults disabled */
> > -	if (obj->active && in_atomic())
> > +	if (i915_gem_object_is_active(obj) && in_atomic())
> >  		return -EFAULT;
> >  
> >  	reloc->delta += target_offset;
> > -- 
> > 1.8.3.3
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/12] drm/i915: Use the new vm [un]bind functions
  2013-07-26 21:48     ` Ben Widawsky
@ 2013-07-26 21:56       ` Daniel Vetter
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-07-26 21:56 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Fri, Jul 26, 2013 at 02:48:32PM -0700, Ben Widawsky wrote:
> On Tue, Jul 23, 2013 at 06:54:43PM +0200, Daniel Vetter wrote:
> > On Sun, Jul 21, 2013 at 07:08:13PM -0700, Ben Widawsky wrote:
> > > Building on the last patch which created the new function pointers in
> > > the VM for bind/unbind, here we actually put those new function pointers
> > > to use.
> > > 
> > > Split out as a separate patch to aid in review. I'm fine with squashing
> > > into the previous patch if people request it.
> > > 
> > > v2: Updated to address the smart ggtt which can do aliasing as needed
> > > Make sure we bind to global gtt when mappable and fenceable. I thought
> > > we could get away without this initialy, but we cannot.
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > Meta review on the patch split: If you create new functions in a prep
> > patch, then switch and then kill the old functions it's much harder to
> > review whether any unwanted functional changes have been introduced.
> > Reviewers have to essentially keep both the old and new code open and
> > compare by hand.  And generally the really hard regression in gem have
> > been due to such deeply-hidden accidental changes, and we frankly don't
> > yet have the test coverage to just gloss over this.
> > 
> > If you instead first prepare the existing functions by changing the
> > arguments and logic, and then once everything is in place switch over to
> > vfuncs in the 2nd patch changes will be in-place. In-place changes are
> > much easier to review since diff compresses away unchanged parts.
> > 
> > Second reason for this approach is that the functions stay at the same
> > place in the source code file, which reduces the amount of spurious
> > conflicts when rebasing a large set of patches around such changes ...
> > 
> > I need to ponder this more.
> > -Daniel
> 
> ping

Keep it in mind for next time around. I think my general approach is
easier on reviewers ... but hey, vacation!
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26 20:43       ` Daniel Vetter
@ 2013-07-26 23:13         ` Dave Airlie
  2013-07-27  0:05           ` Ben Widawsky
  2013-07-29 22:35           ` Jesse Barnes
  0 siblings, 2 replies; 48+ messages in thread
From: Dave Airlie @ 2013-07-26 23:13 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Ben Widawsky, Intel GFX

>
> Hey, overall it's actually quite a bit of fun.
>
> I do agree that QA is really important for a fastpaced process, but
> it's also not the only peace to get something in. Review (both of the
> patch itself but also of  the test coverage) catches a lot of issues,
> and in many cases not the same ones as QA would. Especially if the
> testcoverage of a new feature is less than stellar, which imo is still
> the case for gem due to the tons of finickle cornercases.

Just my 2c worth on this topic, since I like the current process, and
I believe making it too formal is probably going to make things suck
too much.

I'd rather Daniel was slowing you guys down up front more, I don't
give a crap about Intel project management or personal manager relying
on getting features merged when, I do care that you engineers when you
merge something generally get transferred 100% onto something else and
don't react strongly enough to issues on older code you have created
that either have lain dormant since patches merged or are regressions
since patches merged. So I believe the slowing down of merging
features gives a better chance of QA or other random devs of finding
the misc regressions while you are still focused on the code and
hitting the long term bugs that you guys rarely get resourced to fix
unless I threaten to stop pulling stuff.

So whatever Daniel says goes as far as I'm concerned, if I even
suspect he's taken some internal Intel pressure to merge some feature,
I'm going to stop pulling from him faster than I stopped pulling from
the previous maintainers :-), so yeah engineers should be prepared to
backup what they post even if Daniel is wrong, but on the other hand
they need to demonstrate they understand the code they are pushing and
sometimes with ppgtt and contexts I'm not sure anyone really
understands how the hw works let alone the sw :-P

Dave.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26 23:13         ` Dave Airlie
@ 2013-07-27  0:05           ` Ben Widawsky
  2013-07-27  8:52             ` Dave Airlie
  2013-07-29 22:35           ` Jesse Barnes
  1 sibling, 1 reply; 48+ messages in thread
From: Ben Widawsky @ 2013-07-27  0:05 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Intel GFX

On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote:
> >
> > Hey, overall it's actually quite a bit of fun.
> >
> > I do agree that QA is really important for a fastpaced process, but
> > it's also not the only peace to get something in. Review (both of the
> > patch itself but also of  the test coverage) catches a lot of issues,
> > and in many cases not the same ones as QA would. Especially if the
> > testcoverage of a new feature is less than stellar, which imo is still
> > the case for gem due to the tons of finickle cornercases.
> 
> Just my 2c worth on this topic, since I like the current process, and
> I believe making it too formal is probably going to make things suck
> too much.
> 
> I'd rather Daniel was slowing you guys down up front more, I don't
> give a crap about Intel project management or personal manager relying
> on getting features merged when, I do care that you engineers when you
> merge something generally get transferred 100% onto something else and
> don't react strongly enough to issues on older code you have created
> that either have lain dormant since patches merged or are regressions
> since patches merged. So I believe the slowing down of merging
> features gives a better chance of QA or other random devs of finding
> the misc regressions while you are still focused on the code and
> hitting the long term bugs that you guys rarely get resourced to fix
> unless I threaten to stop pulling stuff.
> 
> So whatever Daniel says goes as far as I'm concerned, if I even
> suspect he's taken some internal Intel pressure to merge some feature,
> I'm going to stop pulling from him faster than I stopped pulling from
> the previous maintainers :-), so yeah engineers should be prepared to
> backup what they post even if Daniel is wrong, but on the other hand
> they need to demonstrate they understand the code they are pushing and
> sometimes with ppgtt and contexts I'm not sure anyone really
> understands how the hw works let alone the sw :-P
> 
> Dave.

Honestly, I wouldn't have responded if you didn't mention the Intel
program management thing...

The problem I am trying to emphasize, and let's use contexts/ppgtt as an
example, is we have three options:
1. It's complicated, and a big change, so let's not do it.
2. I continue to rebase the massive change on top of the extremely fast
paced i915 tree, with no QA coverage.
3. We get decent bits merged ASAP by putting it in a repo that both gets
much wider usage than my personal branch, and gets nightly QA coverage.

PPGTT + Contexts have existed for a while, and so we went with #1 for
quite a while.

Now we're at #2. There's two sides to your 'developer needs to
defend...' I need Daniel to give succinct feedback, and agree upon steps
required to get code merged. My original gripe was that it's hard to
deal with the, "that patch is too big" comments almost 2 months after
the first version was sent. Equally, "that looks funny" without a real
explanation of what looks funny, or sufficient thought up front about
what might look better is just as hard to deal with. Inevitably, yes -
it's a big scary series of patches - but if we're honest with ourselves,
it's almost guaranteed to blow up somewhere regardless of how much we
rework it, and who reviews it. Blowing up long before you merge would
always be better than the after you merge.

My desire is to get to something like #3. I had a really long paragraph
on why and how we could do that, but I've redacted it. Let's just leave
it as, I think that should be the goal.

Finally, let me clear that none of the discussion I'm having with Daniel
that spawned this thread are inspired by Intel program management. My
personal opinion is that your firm stance has really helped us
internally to fight back stupid decisions. Honestly, I wish you had a
more direct input into our management, and product planners.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-27  0:05           ` Ben Widawsky
@ 2013-07-27  8:52             ` Dave Airlie
  2013-08-04 19:55               ` Daniel Vetter
  0 siblings, 1 reply; 48+ messages in thread
From: Dave Airlie @ 2013-07-27  8:52 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sat, Jul 27, 2013 at 10:05 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote:
>> >
>> > Hey, overall it's actually quite a bit of fun.
>> >
>> > I do agree that QA is really important for a fastpaced process, but
>> > it's also not the only peace to get something in. Review (both of the
>> > patch itself but also of  the test coverage) catches a lot of issues,
>> > and in many cases not the same ones as QA would. Especially if the
>> > testcoverage of a new feature is less than stellar, which imo is still
>> > the case for gem due to the tons of finickle cornercases.
>>
>> Just my 2c worth on this topic, since I like the current process, and
>> I believe making it too formal is probably going to make things suck
>> too much.
>>
>> I'd rather Daniel was slowing you guys down up front more, I don't
>> give a crap about Intel project management or personal manager relying
>> on getting features merged when, I do care that you engineers when you
>> merge something generally get transferred 100% onto something else and
>> don't react strongly enough to issues on older code you have created
>> that either have lain dormant since patches merged or are regressions
>> since patches merged. So I believe the slowing down of merging
>> features gives a better chance of QA or other random devs of finding
>> the misc regressions while you are still focused on the code and
>> hitting the long term bugs that you guys rarely get resourced to fix
>> unless I threaten to stop pulling stuff.
>>
>> So whatever Daniel says goes as far as I'm concerned, if I even
>> suspect he's taken some internal Intel pressure to merge some feature,
>> I'm going to stop pulling from him faster than I stopped pulling from
>> the previous maintainers :-), so yeah engineers should be prepared to
>> backup what they post even if Daniel is wrong, but on the other hand
>> they need to demonstrate they understand the code they are pushing and
>> sometimes with ppgtt and contexts I'm not sure anyone really
>> understands how the hw works let alone the sw :-P
>>
>> Dave.
>
> Honestly, I wouldn't have responded if you didn't mention the Intel
> program management thing...
>
> The problem I am trying to emphasize, and let's use contexts/ppgtt as an
> example, is we have three options:
> 1. It's complicated, and a big change, so let's not do it.
> 2. I continue to rebase the massive change on top of the extremely fast
> paced i915 tree, with no QA coverage.
> 3. We get decent bits merged ASAP by putting it in a repo that both gets
> much wider usage than my personal branch, and gets nightly QA coverage.
>
> PPGTT + Contexts have existed for a while, and so we went with #1 for
> quite a while.
>
> Now we're at #2. There's two sides to your 'developer needs to
> defend...' I need Daniel to give succinct feedback, and agree upon steps
> required to get code merged. My original gripe was that it's hard to
> deal with the, "that patch is too big" comments almost 2 months after
> the first version was sent. Equally, "that looks funny" without a real
> explanation of what looks funny, or sufficient thought up front about
> what might look better is just as hard to deal with. Inevitably, yes -
> it's a big scary series of patches - but if we're honest with ourselves,
> it's almost guaranteed to blow up somewhere regardless of how much we
> rework it, and who reviews it. Blowing up long before you merge would
> always be better than the after you merge.
>
> My desire is to get to something like #3. I had a really long paragraph
> on why and how we could do that, but I've redacted it. Let's just leave
> it as, I think that should be the goal.
>

Daniel could start taking topic branches like Ingo does, however he'd
have a lot of fun merging them,
he's already getting closer and closer to the extreme stuff -tip does,
and he'd have to feed the topics to QA and possibly -next separately,
the question is when to include a branch or not include it.

Maybe he can schedule a time that QA gets all the branches, and maybe
not put stuff into -next until we are sure its on its way.

Dave.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26 23:13         ` Dave Airlie
  2013-07-27  0:05           ` Ben Widawsky
@ 2013-07-29 22:35           ` Jesse Barnes
  2013-07-29 23:50             ` Dave Airlie
  1 sibling, 1 reply; 48+ messages in thread
From: Jesse Barnes @ 2013-07-29 22:35 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Ben Widawsky, Intel GFX

On Sat, 27 Jul 2013 09:13:38 +1000
Dave Airlie <airlied@gmail.com> wrote:

> >
> > Hey, overall it's actually quite a bit of fun.
> >
> > I do agree that QA is really important for a fastpaced process, but
> > it's also not the only peace to get something in. Review (both of the
> > patch itself but also of  the test coverage) catches a lot of issues,
> > and in many cases not the same ones as QA would. Especially if the
> > testcoverage of a new feature is less than stellar, which imo is still
> > the case for gem due to the tons of finickle cornercases.
> 
> Just my 2c worth on this topic, since I like the current process, and
> I believe making it too formal is probably going to make things suck
> too much.
> 
> I'd rather Daniel was slowing you guys down up front more, I don't
> give a crap about Intel project management or personal manager relying
> on getting features merged when, I do care that you engineers when you
> merge something generally get transferred 100% onto something else and
> don't react strongly enough to issues on older code you have created
> that either have lain dormant since patches merged or are regressions
> since patches merged. So I believe the slowing down of merging
> features gives a better chance of QA or other random devs of finding
> the misc regressions while you are still focused on the code and
> hitting the long term bugs that you guys rarely get resourced to fix
> unless I threaten to stop pulling stuff.
> 
> So whatever Daniel says goes as far as I'm concerned, if I even
> suspect he's taken some internal Intel pressure to merge some feature,
> I'm going to stop pulling from him faster than I stopped pulling from
> the previous maintainers :-), so yeah engineers should be prepared to
> backup what they post even if Daniel is wrong, but on the other hand
> they need to demonstrate they understand the code they are pushing and
> sometimes with ppgtt and contexts I'm not sure anyone really
> understands how the hw works let alone the sw :-P

Some of this is driven by me, because I have one main goal in mind in
getting our code upstream: I want high quality kernel support for our
products upstream and released, in an official Linus release, before the
product ships.  That gives OSVs and other downstream consumers of the
code a chance to get the bits and be ready when products start rolling
out.

Without a bounded time process for getting bits upstream, that can't
happen.  That's why I was trying to encourage reviewers to provide
specific feedback, since vague feedback is more likely to leave a
patchset in the doldrums and de-motivate the author.

I think the "slowing things down" may hurt more than it helps here.
For example all the time Paulo spends on refactoring and rebasing his
PC8 stuff is time he could have spent on HSW bugs instead.  Likewise
with Ben's stuff (and there the rebasing is actually reducing quality
rather than increasing it, at least from a bug perspective).

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-29 22:35           ` Jesse Barnes
@ 2013-07-29 23:50             ` Dave Airlie
  2013-08-04 20:17               ` Daniel Vetter
  0 siblings, 1 reply; 48+ messages in thread
From: Dave Airlie @ 2013-07-29 23:50 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Ben Widawsky, Intel GFX

>> > I do agree that QA is really important for a fastpaced process, but
>> > it's also not the only peace to get something in. Review (both of the
>> > patch itself but also of  the test coverage) catches a lot of issues,
>> > and in many cases not the same ones as QA would. Especially if the
>> > testcoverage of a new feature is less than stellar, which imo is still
>> > the case for gem due to the tons of finickle cornercases.
>>
>> Just my 2c worth on this topic, since I like the current process, and
>> I believe making it too formal is probably going to make things suck
>> too much.
>>
>> I'd rather Daniel was slowing you guys down up front more, I don't
>> give a crap about Intel project management or personal manager relying
>> on getting features merged when, I do care that you engineers when you
>> merge something generally get transferred 100% onto something else and
>> don't react strongly enough to issues on older code you have created
>> that either have lain dormant since patches merged or are regressions
>> since patches merged. So I believe the slowing down of merging
>> features gives a better chance of QA or other random devs of finding
>> the misc regressions while you are still focused on the code and
>> hitting the long term bugs that you guys rarely get resourced to fix
>> unless I threaten to stop pulling stuff.
>>
>> So whatever Daniel says goes as far as I'm concerned, if I even
>> suspect he's taken some internal Intel pressure to merge some feature,
>> I'm going to stop pulling from him faster than I stopped pulling from
>> the previous maintainers :-), so yeah engineers should be prepared to
>> backup what they post even if Daniel is wrong, but on the other hand
>> they need to demonstrate they understand the code they are pushing and
>> sometimes with ppgtt and contexts I'm not sure anyone really
>> understands how the hw works let alone the sw :-P
>
> Some of this is driven by me, because I have one main goal in mind in
> getting our code upstream: I want high quality kernel support for our
> products upstream and released, in an official Linus release, before the
> product ships.  That gives OSVs and other downstream consumers of the
> code a chance to get the bits and be ready when products start rolling
> out.

Your main goal is however different than mine, my main goal is to
not regress the code that is already upstream and have bugs in it
fixed. Slowing down new platform merges seems to do that a lot
better than merging stuff :-)

I realise you guys pay lip service to my goals at times, but I often
get the feeling that you'd rather merge HSW support and run away
to the next platform than spend a lot of time fixing reported bugs in
Ironlake/Sandybridge/Ivybridge *cough RC6 after suspend/resume*.

It would be nice to be proven wrong once in a while where someone is
actually assigned a bug fix in preference to adding new features for new
platforms.

Dave.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-27  8:52             ` Dave Airlie
@ 2013-08-04 19:55               ` Daniel Vetter
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-08-04 19:55 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Ben Widawsky, Intel GFX

On Sat, Jul 27, 2013 at 06:52:55PM +1000, Dave Airlie wrote:
> On Sat, Jul 27, 2013 at 10:05 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote:
> >> >
> >> > Hey, overall it's actually quite a bit of fun.
> >> >
> >> > I do agree that QA is really important for a fastpaced process, but
> >> > it's also not the only peace to get something in. Review (both of the
> >> > patch itself but also of  the test coverage) catches a lot of issues,
> >> > and in many cases not the same ones as QA would. Especially if the
> >> > testcoverage of a new feature is less than stellar, which imo is still
> >> > the case for gem due to the tons of finickle cornercases.
> >>
> >> Just my 2c worth on this topic, since I like the current process, and
> >> I believe making it too formal is probably going to make things suck
> >> too much.
> >>
> >> I'd rather Daniel was slowing you guys down up front more, I don't
> >> give a crap about Intel project management or personal manager relying
> >> on getting features merged when, I do care that you engineers when you
> >> merge something generally get transferred 100% onto something else and
> >> don't react strongly enough to issues on older code you have created
> >> that either have lain dormant since patches merged or are regressions
> >> since patches merged. So I believe the slowing down of merging
> >> features gives a better chance of QA or other random devs of finding
> >> the misc regressions while you are still focused on the code and
> >> hitting the long term bugs that you guys rarely get resourced to fix
> >> unless I threaten to stop pulling stuff.
> >>
> >> So whatever Daniel says goes as far as I'm concerned, if I even
> >> suspect he's taken some internal Intel pressure to merge some feature,
> >> I'm going to stop pulling from him faster than I stopped pulling from
> >> the previous maintainers :-), so yeah engineers should be prepared to
> >> backup what they post even if Daniel is wrong, but on the other hand
> >> they need to demonstrate they understand the code they are pushing and
> >> sometimes with ppgtt and contexts I'm not sure anyone really
> >> understands how the hw works let alone the sw :-P
> >>
> >> Dave.
> >
> > Honestly, I wouldn't have responded if you didn't mention the Intel
> > program management thing...
> >
> > The problem I am trying to emphasize, and let's use contexts/ppgtt as an
> > example, is we have three options:
> > 1. It's complicated, and a big change, so let's not do it.
> > 2. I continue to rebase the massive change on top of the extremely fast
> > paced i915 tree, with no QA coverage.
> > 3. We get decent bits merged ASAP by putting it in a repo that both gets
> > much wider usage than my personal branch, and gets nightly QA coverage.
> >
> > PPGTT + Contexts have existed for a while, and so we went with #1 for
> > quite a while.
> >
> > Now we're at #2. There's two sides to your 'developer needs to
> > defend...' I need Daniel to give succinct feedback, and agree upon steps
> > required to get code merged. My original gripe was that it's hard to
> > deal with the, "that patch is too big" comments almost 2 months after
> > the first version was sent. Equally, "that looks funny" without a real
> > explanation of what looks funny, or sufficient thought up front about
> > what might look better is just as hard to deal with. Inevitably, yes -
> > it's a big scary series of patches - but if we're honest with ourselves,
> > it's almost guaranteed to blow up somewhere regardless of how much we
> > rework it, and who reviews it. Blowing up long before you merge would
> > always be better than the after you merge.
> >
> > My desire is to get to something like #3. I had a really long paragraph
> > on why and how we could do that, but I've redacted it. Let's just leave
> > it as, I think that should be the goal.
> >
> 
> Daniel could start taking topic branches like Ingo does, however he'd
> have a lot of fun merging them,
> he's already getting closer and closer to the extreme stuff -tip does,
> and he'd have to feed the topics to QA and possibly -next separately,
> the question is when to include a branch or not include it.

Yeah, I guess eventually we need to go more crazy with the branching model
for drm/i915. But even getting to the current model was quite some fun, so
I don't want to rock the boat too much if not required ;-)

Also I fear that integrating random developer branches myself will put me
at an ugly spot where I partially maintain (due to the regular merge
conflicts) patches I haven't yet accepted. And since I'm only human I'll
then just merge patches to get rid of the merge pain. So I don't really
want to do that.

Similarly for the internal tree (which just contains hw enabling for
platforms we're not yet allowed to talk about and some related hacks) I've
put down the rule that I won't take patches which are not upstream
material (minus the last bit of polish and no real review requirement).
Otherwise I'll start to bend the upstream rules a bit ... ;-)
 
> Maybe he can schedule a time that QA gets all the branches, and maybe
> not put stuff into -next until we are sure its on its way.

Imo the solution here is for QA to beat the nightly test infrastructure
into a solid enough shape that it can run arbitrary developer branches,
unattended. I think we're slowly getting there (but for obvious reasons
that's no my main aim as the maintainer when working together with our QA
guys).

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-29 23:50             ` Dave Airlie
@ 2013-08-04 20:17               ` Daniel Vetter
  2013-08-05 21:33                 ` Jesse Barnes
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Vetter @ 2013-08-04 20:17 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Ben Widawsky, Intel GFX

The nice thing with kicking off a process discussion before disappearing
into vacation is that I've had a long time to come up with some
well-sharpened opinions. And what better way to start than with a good
old-fashioned flamewar ;-)

On Tue, Jul 30, 2013 at 09:50:21AM +1000, Dave Airlie wrote:
> >> > I do agree that QA is really important for a fastpaced process, but
> >> > it's also not the only peace to get something in. Review (both of the
> >> > patch itself but also of  the test coverage) catches a lot of issues,
> >> > and in many cases not the same ones as QA would. Especially if the
> >> > testcoverage of a new feature is less than stellar, which imo is still
> >> > the case for gem due to the tons of finickle cornercases.
> >>
> >> Just my 2c worth on this topic, since I like the current process, and
> >> I believe making it too formal is probably going to make things suck
> >> too much.
> >>
> >> I'd rather Daniel was slowing you guys down up front more, I don't
> >> give a crap about Intel project management or personal manager relying
> >> on getting features merged when, I do care that you engineers when you
> >> merge something generally get transferred 100% onto something else and
> >> don't react strongly enough to issues on older code you have created
> >> that either have lain dormant since patches merged or are regressions
> >> since patches merged. So I believe the slowing down of merging
> >> features gives a better chance of QA or other random devs of finding
> >> the misc regressions while you are still focused on the code and
> >> hitting the long term bugs that you guys rarely get resourced to fix
> >> unless I threaten to stop pulling stuff.
> >>
> >> So whatever Daniel says goes as far as I'm concerned, if I even
> >> suspect he's taken some internal Intel pressure to merge some feature,
> >> I'm going to stop pulling from him faster than I stopped pulling from
> >> the previous maintainers :-), so yeah engineers should be prepared to
> >> backup what they post even if Daniel is wrong, but on the other hand
> >> they need to demonstrate they understand the code they are pushing and
> >> sometimes with ppgtt and contexts I'm not sure anyone really
> >> understands how the hw works let alone the sw :-P
> >
> > Some of this is driven by me, because I have one main goal in mind in
> > getting our code upstream: I want high quality kernel support for our
> > products upstream and released, in an official Linus release, before the
> > product ships.  That gives OSVs and other downstream consumers of the
> > code a chance to get the bits and be ready when products start rolling
> > out.

Imo the "unpredictable upstream" vs. "high quality kernel support in
upstream" is a false dichotomy. Afaics the "unpredictability" is _because_
I am not willing to compromise on decent quality. I still claim that
upstreaming is a fairly predictable thing (whithin some bounds of how well
some tasks can be estimated up-front without doing some research or
prototyping), and the blocker here is our mediocre project tracking.

I've thought a bit about this (and read a few provoking books about the
matter) over vacation and I fear I get to demonstrate this only by running
the estimation show myself a bit. But atm I'm by far not frustrated enough
yet with the current state of affairs to sign up for that - still chewing
on that maintainer thing ;-)

> Your main goal is however different than mine, my main goal is to
> not regress the code that is already upstream and have bugs in it
> fixed. Slowing down new platform merges seems to do that a lot
> better than merging stuff :-)
> 
> I realise you guys pay lip service to my goals at times, but I often
> get the feeling that you'd rather merge HSW support and run away
> to the next platform than spend a lot of time fixing reported bugs in
> Ironlake/Sandybridge/Ivybridge *cough RC6 after suspend/resume*.
> 
> It would be nice to be proven wrong once in a while where someone is
> actually assigned a bug fix in preference to adding new features for new
> platforms.

Well, that team is 50% Chris&me with other people (many from the community
...) rounding things off. That is quite a bit better than a year ago (and
yep, we blow up stuff, too) but not great. And it's imo also true that
Intel as a company doesn't care one bit once the hw is shipped.

My approach here has been to be a royal jerk about test coverage for new
features and blocking stuff if a regression isn't tackled in time. People
scream all around, but it seems to work and we're imo getting to a "farly
decent regression handling" point. I also try to push for enabling
features across platforms (if the hw should work the same way) in the name
of increased test coverage. That one seems to be less effective (e.g. fbc
for hsw only ...).

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-07-26 17:12         ` Jesse Barnes
@ 2013-08-04 20:31           ` Daniel Vetter
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-08-04 20:31 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Ben Widawsky, Intel GFX

On Fri, Jul 26, 2013 at 10:12:43AM -0700, Jesse Barnes wrote:
> On Fri, 26 Jul 2013 18:08:48 +0100
> Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote:
> > >   4) review comments should be concrete and actionable, and ideally not
> > >      leave the author hanging with hints about problems the reviewer
> > >      has spotted, leaving the author looking for easter eggs
> > 
> > Where am I going to find my fun, if I am not allowed to tell you that
> > you missed a zero in a thousand line patch but not tell you where?
> > Spoilsport :-p
> 
> You'll just need to take up golf or something. :)

Poignant opinion from the guy who bored himself on vacations: I disagree
on two grounds:

Chris without the occasional easter-egg sprinkling just wouldn't be Chris
anymore, at least how I know him. Imo we're a bunch of individuals, quirks
and all, not a pile of interchangeable cogs that just churn out code. And
yes am as amused as the next guy when I spoil by pants by inadvertedly
sitting in one of Chris' easter-eggs, otoh I can't help not grinning when
I discover them in time ;-)

Which leads to the "where's the fun?" question. I've started hacking on
drm/i915 because it's fun (despite the frustration). And the fun is what
keeps me slogging through bug reports each morning. So if we ditch that in
the name of efficiency that'll affect my productivity a lot (just not in
the intended direction) and you'll probably need to look for a new
maintainer ...

With that out of the way I'm obviously not advocating for unclear review -
mail is an occasional rather lossy communication medium and we need to
keep that in mind all the time. I'm only against your easter egg comment,
since throwing those out with the badwather is imo bad.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-08-04 20:17               ` Daniel Vetter
@ 2013-08-05 21:33                 ` Jesse Barnes
  2013-08-05 22:19                   ` Daniel Vetter
  0 siblings, 1 reply; 48+ messages in thread
From: Jesse Barnes @ 2013-08-05 21:33 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Ben Widawsky, Intel GFX

On Sun, 4 Aug 2013 22:17:47 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:
> Imo the "unpredictable upstream" vs. "high quality kernel support in
> upstream" is a false dichotomy. Afaics the "unpredictability" is _because_
> I am not willing to compromise on decent quality. I still claim that
> upstreaming is a fairly predictable thing (whithin some bounds of how well
> some tasks can be estimated up-front without doing some research or
> prototyping), and the blocker here is our mediocre project tracking.

Well, I definitely disagree here.  With our current (and recent past)
processes, we've generally ended up with lots of hw support landing
well after parts start shipping, and the quality hasn't been high (in
terms of user reported bugs) despite all the delay.  So while our code
might look pretty, the fact is that it's late, and has hard to debug
low level bugs (RC6, semaphores, etc).

<rant>
It's fairly easy to add support for hardware well after it ships, and
in a substandard way (e.g. hard power features disabled because we
can't figure them out because the hw debug folks have moved on).  If we
want to keep doing that, fine, but I'd really like us to do better and
catch the hard bugs *before* hw ships, and make sure it's solid and
complete *before* users get it.  But maybe that's just me.  Maybe
treating our driver like any other RE or "best effort" Linux driver is
the right way to go.  If so, fine, let's just not change anything.
</rant>

> My approach here has been to be a royal jerk about test coverage for new
> features and blocking stuff if a regression isn't tackled in time. People
> scream all around, but it seems to work and we're imo getting to a "farly
> decent regression handling" point. I also try to push for enabling
> features across platforms (if the hw should work the same way) in the name
> of increased test coverage. That one seems to be less effective (e.g. fbc
> for hsw only ...).

But code that isn't upstream *WON'T BE TESTED* reasonably.  So if
you're waiting for all tests to be written before going upstream, all
you're doing is delaying the bug reports that will inevitably come in,
both from new test programs and from general usage. On top of that, if
someone is trying to refactor at the same time, things just become a
mess with all sorts of regressions introduced that weren't an issue
with the original patchset...

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-08-05 21:33                 ` Jesse Barnes
@ 2013-08-05 22:19                   ` Daniel Vetter
  2013-08-05 23:34                     ` Jesse Barnes
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Vetter @ 2013-08-05 22:19 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Ben Widawsky, Intel GFX

On Mon, Aug 5, 2013 at 11:33 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> On Sun, 4 Aug 2013 22:17:47 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
>> Imo the "unpredictable upstream" vs. "high quality kernel support in
>> upstream" is a false dichotomy. Afaics the "unpredictability" is _because_
>> I am not willing to compromise on decent quality. I still claim that
>> upstreaming is a fairly predictable thing (whithin some bounds of how well
>> some tasks can be estimated up-front without doing some research or
>> prototyping), and the blocker here is our mediocre project tracking.
>
> Well, I definitely disagree here.  With our current (and recent past)
> processes, we've generally ended up with lots of hw support landing
> well after parts start shipping, and the quality hasn't been high (in
> terms of user reported bugs) despite all the delay.  So while our code
> might look pretty, the fact is that it's late, and has hard to debug
> low level bugs (RC6, semaphores, etc).
>
> <rant>
> It's fairly easy to add support for hardware well after it ships, and
> in a substandard way (e.g. hard power features disabled because we
> can't figure them out because the hw debug folks have moved on).  If we
> want to keep doing that, fine, but I'd really like us to do better and
> catch the hard bugs *before* hw ships, and make sure it's solid and
> complete *before* users get it.  But maybe that's just me.  Maybe
> treating our driver like any other RE or "best effort" Linux driver is
> the right way to go.  If so, fine, let's just not change anything.
> </rant>

The only thing I read here, both in the paragraph above and in the
rant is that we suck. I agree. My opinion is that this is because
we've started late, had too few resources and didn't seriously
estimate how much work is actually involved to enable something for
real.

The only reason I could distill from the above two paragraphs among
the ranting for way we are so much late is "So while our code might
look pretty, it's late and buggy". That's imo a farily shallow stab at
preceived bikesheds, but not a useful angle to improve process.

Now I agree that I uphold a fairly high quality standard for
upstreaming, but not an unreasonable one:
- drm/i915 transformed from the undisputed shittiest driver in the
kernel to one that mostly just works, while picking up development
pace. So I don't think I'm fully wrong on insisting on this level of
quality.
- we do ship the driver essentially continously, which means we can
implement features only by small refactoring steps. That clearly
involves more work than just stitching something together for a
product.

I welcome discussing whether I impose t0o high standards, but that
needs to be supplied with examples and solid reasons. "It's just too
hard" without more context isn't one, since yes, the work we pull off
here actually is hard.

Also note that Chris&me still bear the brute of fixing the random
fallout all over (it's getting better). So if any proposed changes
involves me blowing through even more time to track down issues I'm
strongly not in favour. Same holds for Chris often-heard comment that
a patch needs an improved commit message or a comment somewhere. Yes
it's annoying that you need to resend it (this often bugs me myself)
just to paint the bikeshed a bit different. But imo Chris is pretty
much throughout spot-on with his requests and a high-quality git
history has, in my experience at least, been extremely valueable to
track down the really ugly issues and legalese around all the
established precendence.

>> My approach here has been to be a royal jerk about test coverage for new
>> features and blocking stuff if a regression isn't tackled in time. People
>> scream all around, but it seems to work and we're imo getting to a "farly
>> decent regression handling" point. I also try to push for enabling
>> features across platforms (if the hw should work the same way) in the name
>> of increased test coverage. That one seems to be less effective (e.g. fbc
>> for hsw only ...).
>
> But code that isn't upstream *WON'T BE TESTED* reasonably.  So if
> you're waiting for all tests to be written before going upstream, all
> you're doing is delaying the bug reports that will inevitably come in,
> both from new test programs and from general usage. On top of that, if
> someone is trying to refactor at the same time, things just become a
> mess with all sorts of regressions introduced that weren't an issue
> with the original patchset...

QA on my trees and the igt testcoverage I demand for new features is
to catch regressions once something is merged. We've managed to break
code in less than a day since it's merged on multiple occasions, so
this is very real and just part of the quality standard I impose.

Furthermore I don't want that a new feature regresses overall
stability of our driver. And since that quality is increasing rather
decently I ask for more testcases to exercise cornercases to make sure
they're all covered. This is very much orthogonal to doing review and
just one more puzzle to ensure we don't go back to the neat old days
of shipping half-baked crap.

Note that nowadays QA is catching a lot of the regressions even before
the patches land in Dave's tree (sometimes there's the occasional
brown paper bag event though, but in each such case I analysis the
failure mode and work to prevent it in the future). And imo that's
squarely due to much improved test coverage and the rigid test
coverage requirements for new feautures I impose. And of course the
overall improve QA process flow with much quicker regression
turnaround times also greatly helps here.

Now I agree (and I think I've mentioned this a bunch of times in this
thread already) that this leads to a pain for developers. I see two
main issues, both are (slowly) improving:
- Testing patch series for regressions before merging. QA just set up
the developer patch test system, and despite that it's still rather
limited Ben seems to be fairly happy with where it's going. So I think
we're on track to improve this and avoid the need for developers to
have a private lab like Chris and I essentially have.
- Rebase hell due to ongoing other work. Thus far I've only tried to
help here by rechecking/delaying refactoring patches while big
features are pending. I think we need to try new approaches here and
imo better planing should help. E.g. the initial modeset refactor was
way too big and a monolithic junk that I've just wrestled in by
exorting r-b tags from you. In contrast the pipe config rework was
about equally big, but at any given time only about 30-50 patches
where outstanding (in extreme cases), and mutliple people contributed
different parts of the overall beast. Of course that means that
occasional, for really big stuff, we need to plan to write a first
proof of concept as a landmark where we need to go to, which pretty
much will be thrown away completely.

One meta-comment on top of the actual discussion: I really appreciate
critique and I've grown a good maintainer-skin to also deal with
really harsh critique. But I prefer less ranting and more concrete
examples where I've botched the job (there are plentiful to pick from
imo) and concrete suggestion for how to improve our overall process. I
think these process woes are painful for everyone and due to our fast
growth we're constantly pushing into new levels of ugly, but imo the
way to go forward is by small (sometimes positively tiny), but
continous adjustements and improvements.

I think we both agree where we'd like to be, but at least for me in
the day-to-day fight in the trenches the rosy picture 200 miles away
doesn't really help. Maybe I'm too delusional and sarcastic that way
;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-08-05 22:19                   ` Daniel Vetter
@ 2013-08-05 23:34                     ` Jesse Barnes
  2013-08-06  6:29                       ` Daniel Vetter
  0 siblings, 1 reply; 48+ messages in thread
From: Jesse Barnes @ 2013-08-05 23:34 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Ben Widawsky, Intel GFX

On Tue, 6 Aug 2013 00:19:33 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:
> The only thing I read here, both in the paragraph above and in the
> rant is that we suck. I agree. My opinion is that this is because
> we've started late, had too few resources and didn't seriously
> estimate how much work is actually involved to enable something for
> real.

No, it's more than that, we suck in very specific ways:
  1) large (and sometimes even small) features take waaay too long to
     land upstream, taking valuable developer time away from other
     things like bug fixing, regressions, etc
  2) hw support lands late, which makes it harder to get debug traction
     with tough bugs (e.g. RC6)

> 
> The only reason I could distill from the above two paragraphs among
> the ranting for way we are so much late is "So while our code might
> look pretty, it's late and buggy". That's imo a farily shallow stab at
> preceived bikesheds, but not a useful angle to improve process.

No, I suggested improvements to our process earlier, and it sounded
like you mostly agreed, though seemed to deny point that we spin for
too long on things (point #1 above).

> Now I agree that I uphold a fairly high quality standard for
> upstreaming, but not an unreasonable one:
> - drm/i915 transformed from the undisputed shittiest driver in the
> kernel to one that mostly just works, while picking up development
> pace. So I don't think I'm fully wrong on insisting on this level of
> quality.
> - we do ship the driver essentially continously, which means we can
> implement features only by small refactoring steps. That clearly
> involves more work than just stitching something together for a
> product.

<sarcasm>
You're way off base here.  We should ship a shitty driver and just land
everything without review or testing.  That way we can go really fast.
Your quality standards are too high (in that they exist at all).
</sarcasm>

More seriously, quality should be measured by the end result in terms
of bugs and how users actually use our stuff.  I'm not sure if that's
what you mean by a "high quality standard".  Sometimes it seems you
care more about refactoring things ad-infinitum than tested code.

> Also note that Chris&me still bear the brute of fixing the random
> fallout all over (it's getting better). So if any proposed changes
> involves me blowing through even more time to track down issues I'm
> strongly not in favour. Same holds for Chris often-heard comment that
> a patch needs an improved commit message or a comment somewhere. Yes
> it's annoying that you need to resend it (this often bugs me myself)
> just to paint the bikeshed a bit different. But imo Chris is pretty
> much throughout spot-on with his requests and a high-quality git
> history has, in my experience at least, been extremely valueable to
> track down the really ugly issues and legalese around all the
> established precendence.

Again, no one is suggesting that we have shitty changelogs or that we
add comments.  Not sure why you brought that up.

> - Rebase hell due to ongoing other work. Thus far I've only tried to
> help here by rechecking/delaying refactoring patches while big
> features are pending. I think we need to try new approaches here and
> imo better planing should help. E.g. the initial modeset refactor was
> way too big and a monolithic junk that I've just wrestled in by
> exorting r-b tags from you. In contrast the pipe config rework was
> about equally big, but at any given time only about 30-50 patches
> where outstanding (in extreme cases), and mutliple people contributed
> different parts of the overall beast. Of course that means that
> occasional, for really big stuff, we need to plan to write a first
> proof of concept as a landmark where we need to go to, which pretty
> much will be thrown away completely.

This is the real issue.  We don't have enough people to burn on
single features for 6 months each so they can be rewritten 3 times until
they look how you would have done it.  If we keep doing that, you
may as well write all of it, and we'll be stuck in my <rant> from
the previous message.  That's why I suggested the two reviewed-by tags
ought to be sufficient as a merge criteria.  Sure, there may be room
for refactoring, but if things are understandable by other developers
and well tested, why block them?

> One meta-comment on top of the actual discussion: I really appreciate
> critique and I've grown a good maintainer-skin to also deal with
> really harsh critique. But I prefer less ranting and more concrete
> examples where I've botched the job (there are plentiful to pick from
> imo) and concrete suggestion for how to improve our overall process.

I've suggested some already, but they've fallen on deaf ears afaict.  I
don't know what more I can do to convince you that you acting as a
review/refactor bottleneck actively undermines the goals I think we
share.

But I'm done with this thread.  Maybe others want to comment on things
they might think improve the situation.

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-08-05 23:34                     ` Jesse Barnes
@ 2013-08-06  6:29                       ` Daniel Vetter
  2013-08-06 14:50                         ` Paulo Zanoni
  0 siblings, 1 reply; 48+ messages in thread
From: Daniel Vetter @ 2013-08-06  6:29 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Ben Widawsky, Intel GFX

Like I've said in my previous mail I expect such discussions to be
hard and I also think stopping now and giving up is the wrong
approach. So another round.

On Tue, Aug 6, 2013 at 1:34 AM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> On Tue, 6 Aug 2013 00:19:33 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
>> The only thing I read here, both in the paragraph above and in the
>> rant is that we suck. I agree. My opinion is that this is because
>> we've started late, had too few resources and didn't seriously
>> estimate how much work is actually involved to enable something for
>> real.
>
> No, it's more than that, we suck in very specific ways:
>   1) large (and sometimes even small) features take waaay too long to
>      land upstream, taking valuable developer time away from other
>      things like bug fixing, regressions, etc
>   2) hw support lands late, which makes it harder to get debug traction
>      with tough bugs (e.g. RC6)
>
>>
>> The only reason I could distill from the above two paragraphs among
>> the ranting for way we are so much late is "So while our code might
>> look pretty, it's late and buggy". That's imo a farily shallow stab at
>> preceived bikesheds, but not a useful angle to improve process.
>
> No, I suggested improvements to our process earlier, and it sounded
> like you mostly agreed, though seemed to deny point that we spin for
> too long on things (point #1 above).

I'll cover your process suggestions below, since some of your
clarifications below shine a different light onto them. But overall I
agree, even that we seem to spin sometimes awfully long.

>> Now I agree that I uphold a fairly high quality standard for
>> upstreaming, but not an unreasonable one:
>> - drm/i915 transformed from the undisputed shittiest driver in the
>> kernel to one that mostly just works, while picking up development
>> pace. So I don't think I'm fully wrong on insisting on this level of
>> quality.
>> - we do ship the driver essentially continously, which means we can
>> implement features only by small refactoring steps. That clearly
>> involves more work than just stitching something together for a
>> product.
>
> <sarcasm>
> You're way off base here.  We should ship a shitty driver and just land
> everything without review or testing.  That way we can go really fast.
> Your quality standards are too high (in that they exist at all).
> </sarcasm>
>
> More seriously, quality should be measured by the end result in terms
> of bugs and how users actually use our stuff.  I'm not sure if that's
> what you mean by a "high quality standard".  Sometimes it seems you
> care more about refactoring things ad-infinitum than tested code.

I often throw in a refactoring suggestion when people work on a
feature, that's right. Often it is also a crappy idea, but imo for
long-term maintainance a neat&tidy codebase is really important. So
I'll just throw them out and see what sticks with people.

I realize that pretty much all of the quality standard discussion here
is really fluffy, but like I explained I get to bear a large part of
the "keep it going" workload. And as long as that's the case I frankly
think my standards carry more weight. Furthermore in the cases where
other people from our team chip in with bugfixing that's mostly in
cases where a self-check or testcase clearly puts the blame on them.
So if that is the only way to volunteer people I'll keep asking for
those things (and delay patches indefinitely like e.g. your fastboot
stuff).

And like I've said I'm open to discuss those requirements, but I
freely admit that I have a rather solid ground resolve on this topic.

>> Also note that Chris&me still bear the brute of fixing the random
>> fallout all over (it's getting better). So if any proposed changes
>> involves me blowing through even more time to track down issues I'm
>> strongly not in favour. Same holds for Chris often-heard comment that
>> a patch needs an improved commit message or a comment somewhere. Yes
>> it's annoying that you need to resend it (this often bugs me myself)
>> just to paint the bikeshed a bit different. But imo Chris is pretty
>> much throughout spot-on with his requests and a high-quality git
>> history has, in my experience at least, been extremely valueable to
>> track down the really ugly issues and legalese around all the
>> established precendence.
>
> Again, no one is suggesting that we have shitty changelogs or that we
> add comments.  Not sure why you brought that up.

I added it since I've just read through some of the patches on the
android internal branch yesterday and a lot of those patches fall
through on the "good enough commit message" criterion (mostly by
failing to explain why the patch is needed). I've figured that's
relevant since on internal irc you've said even pushing fixes to
upstream is a PITA since they require 2-3 rounds to get in.

To keep things concrete one such example is Kamal's recent rc6 fix
where I've asked for a different approach and he sounded rather pissed
that I don't just take his patch as-is. But after I've explained my
reasoning he seemed to agree, at least he sent out a revised version.
And the changes have all been what I guess you'd call bikesheds, since
it was just shuffling the code logic around a bit and pimping the
commit message. I claim that this is worth it and I think your stance
is that we shouldn't delay patches like this. Or is this a bad example
for a patch which you think was unduly delayed? Please bring another
one up in this case, I really think process discussions are easier
with concrete examples.

>> - Rebase hell due to ongoing other work. Thus far I've only tried to
>> help here by rechecking/delaying refactoring patches while big
>> features are pending. I think we need to try new approaches here and
>> imo better planing should help. E.g. the initial modeset refactor was
>> way too big and a monolithic junk that I've just wrestled in by
>> exorting r-b tags from you. In contrast the pipe config rework was
>> about equally big, but at any given time only about 30-50 patches
>> where outstanding (in extreme cases), and mutliple people contributed
>> different parts of the overall beast. Of course that means that
>> occasional, for really big stuff, we need to plan to write a first
>> proof of concept as a landmark where we need to go to, which pretty
>> much will be thrown away completely.
>
> This is the real issue.  We don't have enough people to burn on
> single features for 6 months each so they can be rewritten 3 times until
> they look how you would have done it.  If we keep doing that, you
> may as well write all of it, and we'll be stuck in my <rant> from
> the previous message.  That's why I suggested the two reviewed-by tags
> ought to be sufficient as a merge criteria.  Sure, there may be room
> for refactoring, but if things are understandable by other developers
> and well tested, why block them?

I'd like to see an example here for something that I blocked, really.
One I could think up is the ips feature from Paulo where I've asked to
convert it over to the pipe config tracking. But I asked for that
specifically so that one of our giant long-term feature goals (atomic
modeset) doesn't move further away, so I think for our long-term aims
this request was justified.

Otherwise I have a hard time coming up with features that had r-b tags
from one of the domain expert you've listed (i.e. where understood
well) and I blocked them. It's true that I often spot something small
when applying a patch, but I also often fix it up while applying
(mostly adding notes to the commit message) or asking for a quick
follow-up fixup patch.

>> One meta-comment on top of the actual discussion: I really appreciate
>> critique and I've grown a good maintainer-skin to also deal with
>> really harsh critique. But I prefer less ranting and more concrete
>> examples where I've botched the job (there are plentiful to pick from
>> imo) and concrete suggestion for how to improve our overall process.
>
> I've suggested some already, but they've fallen on deaf ears afaict.

Your above clarification that the 2 r-b tags (one from the domain
expert) should overrule my concern imo makes your original proposal a
bit different - my impression was that you've asked for 2 r-b tags,
period. Which would be more than what we currently have, and since we
have a hard time doing even that would imo I think asking for 2 r-b
tags is completely unrealistic.

One prime example is Ville's watermark patches, which have been ready
(he only did a very few v2 versions for bikesheds) since over a month
ago. But stuck since no one bothered to review them.

So your suggestions (points 1) thru 4) in your original mail in this
thread) haven't fallen on deaf ears. Specifically wrt review from
domain experts I'm ok with just an informal ack and letting someone
else do the detailed review. That way the 2nd function of reviewing of
diffusing knowledge in our distributed team works better when I pick
non-domain-experts.

> I don't know what more I can do to convince you that you acting as a
> review/refactor bottleneck actively undermines the goals I think we
> share.

I disagree that I'm a bottleneck. Just yesterday I've merged roughly
50 patches because they where all nicely reviewed. And like I've said
some of those patches have been stuck for a month in
no-one-bothers-to-review-them limbo land.

If we drag out another example and look at the ppgtt stuff from Ben
which I've asked to be reworked quite a bit. Now one mistake I've done
is to be way too optimistic about how much time this will take when
hashing out a merge plan with Ben. I've committed the mistake of
trying to fit the work that I think needs to be done into the
available time Ben has and so done the same wishful thinking planning
I complain about all the time. Next time around I'll try to make an
honest plan first and then try to fit it into the time we have instead
of the other way round.

But I really think the rework was required since with the original
patch series I was often left with the nagging feeling that I just
don't understand what's going on, and whether I'd really be able to
track down a regression if it bisected to one of the patches. So I
couldn't slap an honset r-b tag onto it. The new series is imo great
and a joy to review.

So again please bring up an example where I've failed and we can look
at it and figure out what needs to change to improve the process. Imo
those little patches and adjustements to our process are the way
forward. At least that approach worked really well for beating our
kernel QA process into shape. And yes, it's tedious and results will
take time to show up.

> But I'm done with this thread.  Maybe others want to comment on things
> they might think improve the situation.

I'm not letting you off the hook that easily ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-08-06  6:29                       ` Daniel Vetter
@ 2013-08-06 14:50                         ` Paulo Zanoni
  2013-08-06 17:06                           ` Daniel Vetter
  2013-08-06 23:28                           ` Dave Airlie
  0 siblings, 2 replies; 48+ messages in thread
From: Paulo Zanoni @ 2013-08-06 14:50 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Ben Widawsky, Intel GFX

Hi

A few direct responses and my 2 cents at the end. This is all my
humble opinion, feel free to disagree or ignore it :)

2013/8/6 Daniel Vetter <daniel@ffwll.ch>:
>
> I often throw in a refactoring suggestion when people work on a
> feature, that's right. Often it is also a crappy idea, but imo for
> long-term maintainance a neat&tidy codebase is really important. So
> I'll just throw them out and see what sticks with people.
>

The problem is that if you throw and it doesn't stick, then people
feel you won't merge it. So they kinda feel they have to do it all the
time.

Another thing is that sometimes the refactoring is just plain
bikeshedding, and that leads to demotivated workers. People write
things on their way, but then they are forced to do it in another way,
which is also correct, but just different, and wastes a lot of time.
And I'm not talking specifically about Daniel's suggestions, everybody
does this kind of bikeshedding (well, I'm sure I do). If someone gives
a bikeshed to a patch, Daniel will see there's an unattended review
comment and will not merge the patch at all, so basically a random
reviewer can easily block someone else's patch. I guess we all should
try to give less bikeshedding, including me.

>
> One prime example is Ville's watermark patches, which have been ready
> (he only did a very few v2 versions for bikesheds) since over a month
> ago. But stuck since no one bothered to review them.

Actually I subscribed myself to review (on review board) and purposely
waited until he was back from vacation before I would start the
review. I also did early 0-day testing on real hardware, which is IMHO
way much more useful than just reviewing. Something that happened many
times for me in the past: I reviewed a patch, thought it was correct,
then decided to boot the patch before sending the R-B email and found
a bug.


And my 2 cents:

Daniel and Jesse are based on different premises, which means they
will basically discuss forever until they realize that.

In an exaggerated view, Daniel's premises:
- Merging patches with bugs is unacceptable
  - Colorary: users should never have to report bugs/regressions
- Delaying patch merging due to refactoring or review comments will
always make it better

In the same exaggerated view, Jesse's premises:
- Actual user/developer testing is more valuable than review and refactoring
  - Colorary: merging code with bugs is acceptable, we want the bug reports
- Endless code churn due to review/refactoring may actually introduce
bugs not present in the first version

Please tell me if I'm wrong.

>From my point of view, this is all about tradeoffs and you two stand
on different positions in these tradeoffs. Example:
- Time time you save by not doing all the refactoring/bikeshedding can
be spent doing bug fixing or reviewing/testing someone else's patches.
  - But the question is: which one is more worth it? An hour
refactoring/rebasing so the code behaves exactly like $reviewer wants,
or an hour staring at bugzilla or reviewing/testing patches?
  - From my point of view, it seems Daniel assumes people will always
spend 0 time fixing bugs, that's why he requests people so much
refactoring: the tradeoff slider is completely at one side. But that's
kind of a vicious/virtuous cycle: the more he increases his "quality
standards", the more we'll spend time on the refactorings, so we'll
spend even less time on bugzilla", so Daniel will increase the
standards even more due to even less time spent on bugzilla, and so
on.

One thing which we didn't discuss explicitly right now  and IMHO is
important is how people *feel* about all this. It seems to me that the
current amount of reworking required is making some people (e.g.,
Jesse, Ben) demotivated and unhappy. While this is not really a
measurable thing, I'm sure it negatively affects the rate we improve
our code base and fix our bugs. If we bikeshed a feature to the point
where the author gets fed up with it and just wants it to get merged,
there's a high chance that future bugs discovered on this feature
won't be solved that quickly due the stressful experience the author
had with the feature. And sometimes the unavoidable "I'll just
implement whatever review comments I get because I'm so tired about
this series and now I just want to get it merged" attitude is a very
nice way to introduce bugs.

And one more thing. IMHO this discussion should all be on how we deal
with the people on our team, who get paid to write this code. When
external people contribute patches to us, IMHO we should give them big
thanks, send emails with many smileys, and hold all our spotted
bikesheds to separate patches that we'll send later. Too high quality
standards doesn't seem to be a good way to encourage people who don't
dominate our code base.


My possible suggestions:

- We already have drm-intel-next-queued as a barrier to protect
against bugs in merged patches (it's a barrier to drm-intel-next,
which external people should be using). Even though I do not spend
that much time on bugzilla bugs, I do rebase on dinq/nightly every day
and try to make sure all the regressions I spot are fixed, and I count
this as "bug fixing time". What if we resist our OCDs and urge to
request reworks, then merge patches to dinq more often? To compensate
for this, if anybody reports a single problem in a patch or series
present on dinq, it gets immediately reverted (which means dinq will
either do lots of rebasing or contain many many reverts). And we try
to keep drm-intel-next away from all the dinq madness. Does that sound
maintainable?
- Another idea I already gave a few times is to accept features more
easily, but leave them disabled by default until all the required
reworks are there. Daniel rejected this idea because he feels people
won't do the reworks and will leave the feature disabled by default
forever. My counter-argument: 99% of the features we do are somehow
tracked by PMs, we should make sure the PMs know features are still
disabled, and perhaps open sub-tasks on the feature tracking systems
to document that the feature is not yet completed since it's not
enabled by default.

In other words: this problem is too hard, it's about tradeoffs and
there's no perfect solution that will please everybody.

My just 2 cents, I hope to not have offended anybody :(

Cheers,
Paulo

-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-08-06 14:50                         ` Paulo Zanoni
@ 2013-08-06 17:06                           ` Daniel Vetter
  2013-08-06 23:28                           ` Dave Airlie
  1 sibling, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2013-08-06 17:06 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: Ben Widawsky, Intel GFX

On Tue, Aug 6, 2013 at 4:50 PM, Paulo Zanoni <przanoni@gmail.com> wrote:
> A few direct responses and my 2 cents at the end. This is all my
> humble opinion, feel free to disagree or ignore it :)

I think you make some excellent points, so thanks a lot for joining
the discussion.

> 2013/8/6 Daniel Vetter <daniel@ffwll.ch>:
>>
>> I often throw in a refactoring suggestion when people work on a
>> feature, that's right. Often it is also a crappy idea, but imo for
>> long-term maintainance a neat&tidy codebase is really important. So
>> I'll just throw them out and see what sticks with people.
>>
>
> The problem is that if you throw and it doesn't stick, then people
> feel you won't merge it. So they kinda feel they have to do it all the
> time.
>
> Another thing is that sometimes the refactoring is just plain
> bikeshedding, and that leads to demotivated workers. People write
> things on their way, but then they are forced to do it in another way,
> which is also correct, but just different, and wastes a lot of time.
> And I'm not talking specifically about Daniel's suggestions, everybody
> does this kind of bikeshedding (well, I'm sure I do). If someone gives
> a bikeshed to a patch, Daniel will see there's an unattended review
> comment and will not merge the patch at all, so basically a random
> reviewer can easily block someone else's patch. I guess we all should
> try to give less bikeshedding, including me.

Yeah, that happens. With all the stuff going on I relly can't keep
track of everything, so if it looks like the patch author and the
reviewer are still going back&forth I just wait. And like I've
explained in private once I don't like stepping in as the maintainer
when this happens since I'm not the topic expert by far, so my
assessment will be about as good as a coin-toss. Of course if the
question centers around integration issues with the overall codebase
I'll happily chime in.

I think the only way to reduce time wasted in such stuck discussions
is to admit that the best solution isn't clear and that adding a fixme
comment somewhere to look at the issue again for the next platform
(bug, regression, feature, ...) that touches the same area. Or maybe
reconsider once everything has landed and it's clear what then
end-result really looks like.

>> One prime example is Ville's watermark patches, which have been ready
>> (he only did a very few v2 versions for bikesheds) since over a month
>> ago. But stuck since no one bothered to review them.
>
> Actually I subscribed myself to review (on review board) and purposely
> waited until he was back from vacation before I would start the
> review. I also did early 0-day testing on real hardware, which is IMHO
> way much more useful than just reviewing. Something that happened many
> times for me in the past: I reviewed a patch, thought it was correct,
> then decided to boot the patch before sending the R-B email and found
> a bug.

Imo review shouldn't require you to apply the patches and test them.
Of course if it helps you to convince yourself the patch is good I'm
fine with that approach. But myself if I have doubts I prefer to check
whether a testcase/selfcheck exists to exercise that corner case (and
so will prevent this from also ever breaking again). Testing itself
should be done by the developer (or bug reporter). Hopefully the
developer patch test system that QA is now rolling out will help a lot
in that regard.

> And my 2 cents:
>
> Daniel and Jesse are based on different premises, which means they
> will basically discuss forever until they realize that.
>
> In an exaggerated view, Daniel's premises:
> - Merging patches with bugs is unacceptable
>   - Colorary: users should never have to report bugs/regressions
> - Delaying patch merging due to refactoring or review comments will
> always make it better
>
> In the same exaggerated view, Jesse's premises:
> - Actual user/developer testing is more valuable than review and refactoring
>   - Colorary: merging code with bugs is acceptable, we want the bug reports
> - Endless code churn due to review/refactoring may actually introduce
> bugs not present in the first version
>
> Please tell me if I'm wrong.

At least from my pov I think this is a very accurate description of
our different assumptions and how that shapes how we perceive these
process issues.

> From my point of view, this is all about tradeoffs and you two stand
> on different positions in these tradeoffs. Example:
> - Time time you save by not doing all the refactoring/bikeshedding can
> be spent doing bug fixing or reviewing/testing someone else's patches.
>   - But the question is: which one is more worth it? An hour
> refactoring/rebasing so the code behaves exactly like $reviewer wants,
> or an hour staring at bugzilla or reviewing/testing patches?
>   - From my point of view, it seems Daniel assumes people will always
> spend 0 time fixing bugs, that's why he requests people so much
> refactoring: the tradeoff slider is completely at one side. But that's
> kind of a vicious/virtuous cycle: the more he increases his "quality
> standards", the more we'll spend time on the refactorings, so we'll
> spend even less time on bugzilla", so Daniel will increase the
> standards even more due to even less time spent on bugzilla, and so
> on.

tbh I haven't considered that I might cause a negative feedback cycle here.

One thing that seems to work (at least for me) is when we have good
testcase. With QA's much improved regression reporting I can then
directly assign a bug to the patch auther of the offending commit.
That seems to help a lot in distributing the regression handling work.

But more tests aren't a magic solution since they also take a lot of
time to write. And in a few areas our test coverage gaps are still so
big that relying on tests only for good quality and much less on
clean&clear code which is easy to review isn't really a workable
approach. But I'd be willing to trade off more tests for less bikeshed
in review since imo the two parts are at least partial substitutes.
Thus far though writing tests seems to often come as an afterthough
and not as the first thing, so I guess this doesn't work too well with
our current team. Personally I don't like writing testcases too much,
even though it's fun to blow up the kernel ;-) And it often helps a
_lot_ with understanding the exact nature of a bug/issue, at least for
me.

Another approach could be if developers try to proactively work a bit
on issues in they're area and take active ownership, I'm much more
inclined to just merge patches in this case. Examples are how Jani
wrestles around with the backlight code or how you constantly hunt
down unclaimed register issues. Unfortunately that requires that
people follow the bugspam and m-l mail flood, which is a major time
drain :(

> One thing which we didn't discuss explicitly right now  and IMHO is
> important is how people *feel* about all this. It seems to me that the
> current amount of reworking required is making some people (e.g.,
> Jesse, Ben) demotivated and unhappy. While this is not really a
> measurable thing, I'm sure it negatively affects the rate we improve
> our code base and fix our bugs. If we bikeshed a feature to the point
> where the author gets fed up with it and just wants it to get merged,
> there's a high chance that future bugs discovered on this feature
> won't be solved that quickly due the stressful experience the author
> had with the feature. And sometimes the unavoidable "I'll just
> implement whatever review comments I get because I'm so tired about
> this series and now I just want to get it merged" attitude is a very
> nice way to introduce bugs.

Yep, people are the most important thing, technical issues can usually
be solved much easier. Maybe we need to look for different approaches
that suit people better (everyone's a bit different), like the idea
above to emphasis tests more instead of code cleanliness and
consistency. E.g. for your current pc8+ stuff I've somewhat decided
that I'm not going to drop bikesheds, but just make sure the testcase
looks good. Well throw a few ideas around while reading the patches,
but those are just ideas ... again a case I guess where you can
mistake my suggestions as requirements :(

I need to work on making such idea-throwing clearer.

Otherwise I'm running a bit low on ideas how we could change the patch
polishing for upstream to better suit people and prevent fatalistic
"this isn't really my work anymore" resingation. Ideas?

> And one more thing. IMHO this discussion should all be on how we deal
> with the people on our team, who get paid to write this code. When
> external people contribute patches to us, IMHO we should give them big
> thanks, send emails with many smileys, and hold all our spotted
> bikesheds to separate patches that we'll send later. Too high quality
> standards doesn't seem to be a good way to encourage people who don't
> dominate our code base.

I disagree. External contributions should follow the same standards as
our own code. And just because we're paid to do this doesn't mean I
won't be really happy about a tricky bugfix or a cool feature. Afaic
remember the only non-intel feature that was merged that imo didn't
live up to my standards was the initial i915 prime support from Dave.
And I've clearly stated that I won't merge the patch through my tree
and listed the reasons why I think it's not ready.

> My possible suggestions:
>
> - We already have drm-intel-next-queued as a barrier to protect
> against bugs in merged patches (it's a barrier to drm-intel-next,
> which external people should be using). Even though I do not spend
> that much time on bugzilla bugs, I do rebase on dinq/nightly every day
> and try to make sure all the regressions I spot are fixed, and I count
> this as "bug fixing time". What if we resist our OCDs and urge to
> request reworks, then merge patches to dinq more often? To compensate
> for this, if anybody reports a single problem in a patch or series
> present on dinq, it gets immediately reverted (which means dinq will
> either do lots of rebasing or contain many many reverts). And we try
> to keep drm-intel-next away from all the dinq madness. Does that sound
> maintainable?

I occasionally botch a revert/merge/rebase and since it wouldn't scale
when I ask people to cross check my tree in detail every time (or
people just assume I didn't botch it) those slip out. So I prefer if I
don't have to maintain more volatile trees.

I'm also not terribly in favour of merging stuff early and hoping for
reworks since often the attention moves immediately to the next thing.
E.g. VECS support was merged after a long delay when finally some
basic tests popped up. But then a slight change from Mika to better
exercise some seqno wrap/gpu reset corner cases showed that semaphores
don't work with VECS. QA dutifully reported this bug and Chris
analysis the gpu hang state. Ever since then this was ignored. So I
somewhat agree with Dave here, at least sometimes ...

I'm also not sure that an immediate revert rule is the right approach.
Often an issue is just minor (e.g. the modeset state checker trips
up), dropping the patch right away might be the wrong approach. Of
course if something doesn't get fixed quickly that's not great,
either.

> - Another idea I already gave a few times is to accept features more
> easily, but leave them disabled by default until all the required
> reworks are there. Daniel rejected this idea because he feels people
> won't do the reworks and will leave the feature disabled by default
> forever. My counter-argument: 99% of the features we do are somehow
> tracked by PMs, we should make sure the PMs know features are still
> disabled, and perhaps open sub-tasks on the feature tracking systems
> to document that the feature is not yet completed since it's not
> enabled by default.

I'm not sure how much that would help. If something is disabled by
default it won't getting beaten on by QA. And Jesse is right that we
just need that coverage, but to discover corner case bugs but also to
ensure a feature doesn't regress. If we merge something disabled by
default I fear it'll bitrot as quickly as an unmerged patch series.
But we leave in the delusion that it all still works. So I'm not sure
it's a good approach, but with psr we kinda have this as a real-world
experiment running. Let's see how it goes ...

> In other words: this problem is too hard, it's about tradeoffs and
> there's no perfect solution that will please everybody.

Yeah, I think your approach of clearly stating this as a tradeoff
issue cleared up things a lot for me. I think we need to actively hunt
for opportunities and new ideas. I've added a few of my own above, but
I think it's clear that there's no silver bullet.

One idea I'm pondering is whether a much more detailed breakdown of a
task/feature/... and how to get the test coverage and all the parts
merged could help. At least from my pov a big part of the frustration
seems to stem from the fact that the upstreaming process is highly
variable, and like I've said a few times I think we can do much
better. At least once we've tried this a few times and have some
experience. But again this is not for free but involves quite some
work. And I guess I need to be highly involved or even do large parts
of that break-down to make sure nothing gets missed, and I kinda don't
want to sign up for that work ;-)

> My just 2 cents, I hope to not have offended anybody :(

Not at all, and I think your input has been very valuable to the discussion.

Thanks a lot,
Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)
  2013-08-06 14:50                         ` Paulo Zanoni
  2013-08-06 17:06                           ` Daniel Vetter
@ 2013-08-06 23:28                           ` Dave Airlie
  1 sibling, 0 replies; 48+ messages in thread
From: Dave Airlie @ 2013-08-06 23:28 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: Ben Widawsky, Intel GFX

>
> In the same exaggerated view, Jesse's premises:
> - Actual user/developer testing is more valuable than review and refactoring
>   - Colorary: merging code with bugs is acceptable, we want the bug reports
> - Endless code churn due to review/refactoring may actually introduce
> bugs not present in the first version
>
> Please tell me if I'm wrong.
>
> From my point of view, this is all about tradeoffs and you two stand
> on different positions in these tradeoffs. Example:
> - Time time you save by not doing all the refactoring/bikeshedding can
> be spent doing bug fixing or reviewing/testing someone else's patches.
>   - But the question is: which one is more worth it? An hour
> refactoring/rebasing so the code behaves exactly like $reviewer wants,
> or an hour staring at bugzilla or reviewing/testing patches?
>   - From my point of view, it seems Daniel assumes people will always
> spend 0 time fixing bugs, that's why he requests people so much
> refactoring: the tradeoff slider is completely at one side. But that's
> kind of a vicious/virtuous cycle: the more he increases his "quality
> standards", the more we'll spend time on the refactorings, so we'll
> spend even less time on bugzilla", so Daniel will increase the
> standards even more due to even less time spent on bugzilla, and so
> on.

Here is the thing, before Daniel started making people write tests and
bikeshedding,
people spent 0 time on bugs, I can dig up countless times now I've had
RHEL regressions
that I've had to stop merging code to get anyone to look at.

So Jesse likes to think that people will have more time to look at
bugzilla if they aren't
refactoring patches, but generally I find people will just get moved
onto the next task the
second the code is merged by Daniel, and will fight against taking any
responsibility for
code that is already merged unless hit with a big stick.

This is just ingrained in how people work, doing new shiny stuff is
always more fun than
spending 4 days or weeks to send a one liner patch, so really if
people thinking we just need to merge
most stuff faster is the solution they are delusional, and I'll gladly
stop pulling until they stop.

I've spent 2-3 weeks on single bugs in the graphics stack before and
I'm sure I will again, but the incentive to go
hunting for them generally comes from someone important reporting the
bug, not from a misc bug report
in bugzilla from someone who isn't a monetary concern. So Jesse if you
really believe the team will focus on bugs
2-3 months after the code is merged and drop their priority for
merging whatever cool feature they are on now, then
maybe I'd agree, but so far history has shown this never happens.

Dave.

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2013-08-06 23:28 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-22  2:08 [PATCH 00/12] Completion of i915 VMAs Ben Widawsky
2013-07-22  2:08 ` [PATCH 01/12] drm/i915: plumb VM into object operations Ben Widawsky
2013-07-23 16:37   ` Daniel Vetter
2013-07-26  9:51   ` Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations) Daniel Vetter
2013-07-26 16:59     ` Jesse Barnes
2013-07-26 17:08       ` Chris Wilson
2013-07-26 17:12         ` Jesse Barnes
2013-08-04 20:31           ` Daniel Vetter
2013-07-26 17:40       ` Daniel Vetter
2013-07-26 20:15     ` Ben Widawsky
2013-07-26 20:43       ` Daniel Vetter
2013-07-26 23:13         ` Dave Airlie
2013-07-27  0:05           ` Ben Widawsky
2013-07-27  8:52             ` Dave Airlie
2013-08-04 19:55               ` Daniel Vetter
2013-07-29 22:35           ` Jesse Barnes
2013-07-29 23:50             ` Dave Airlie
2013-08-04 20:17               ` Daniel Vetter
2013-08-05 21:33                 ` Jesse Barnes
2013-08-05 22:19                   ` Daniel Vetter
2013-08-05 23:34                     ` Jesse Barnes
2013-08-06  6:29                       ` Daniel Vetter
2013-08-06 14:50                         ` Paulo Zanoni
2013-08-06 17:06                           ` Daniel Vetter
2013-08-06 23:28                           ` Dave Airlie
2013-07-22  2:08 ` [PATCH 02/12] drm/i915: Fix up map and fenceable for VMA Ben Widawsky
2013-07-23 16:42   ` Daniel Vetter
2013-07-23 18:14     ` Ben Widawsky
2013-07-22  2:08 ` [PATCH 03/12] drm/i915: Update error capture for VMs Ben Widawsky
2013-07-22  2:08 ` [PATCH 04/12] drm/i915: Track active by VMA instead of object Ben Widawsky
2013-07-23 16:48   ` Daniel Vetter
2013-07-26 21:48     ` Ben Widawsky
2013-07-22  2:08 ` [PATCH 05/12] drm/i915: Add map/unmap object functions to VM Ben Widawsky
2013-07-22  2:08 ` [PATCH 06/12] drm/i915: Use the new vm [un]bind functions Ben Widawsky
2013-07-23 16:54   ` Daniel Vetter
2013-07-26 21:48     ` Ben Widawsky
2013-07-26 21:56       ` Daniel Vetter
2013-07-22  2:08 ` [PATCH 07/12] drm/i915: eliminate vm->insert_entries() Ben Widawsky
2013-07-23 16:57   ` Daniel Vetter
2013-07-22  2:08 ` [PATCH 08/12] drm/i915: Add vma to list at creation Ben Widawsky
2013-07-22  2:08 ` [PATCH 09/12] drm/i915: create vmas at execbuf Ben Widawsky
2013-07-22 13:32   ` Chris Wilson
2013-07-22  2:08 ` [PATCH 10/12] drm/i915: Convert execbuf code to use vmas Ben Widawsky
2013-07-22  2:08 ` [PATCH 11/12] drm/i915: Convert object coloring to VMA Ben Widawsky
2013-07-23 17:07   ` Daniel Vetter
2013-07-22  2:08 ` [PATCH 12/12] drm/i915: Convert active API " Ben Widawsky
2013-07-22 10:42 ` [PATCH 00/12] Completion of i915 VMAs Chris Wilson
2013-07-22 16:35   ` Ben Widawsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.